Search Results

Search found 2568 results on 103 pages for 'mup sys'.

  • .tex file remains in use by process when batch file is triggered by .Rnw Sweave processing.

    - by drknexus
    This is a pretty specialized question. I'm using the Eclipse IDE in a Windows XP environment with the StatET plug-in so I can write R code as an R/Sweave document. This produces a .tex file that is then post-processed by pdflatex.exe. When I create the file as normal, everything works great (except that my file named russfnc2.Rnw seems to result in russfnc.pdf, even though pdflatex.exe in the console window correctly says that the output is being written to russfnc2.pdf).

    The big problem is when I trigger a batch file from within my Rnw code. My goal here is to spawn a side process that waits for the PDF to be made and uploads it to the server. So the Rnw contains:

        if(file.exists("rsp.finalize.bat")) {system("rsp.finalize.bat",wait=FALSE,invisible=FALSE)}

    The batch file calls Rterm.exe to run a script:

        setwd("C:/theprojectdirectory")
        while(!file.exists("russfnc.pdf")) {
          Sys.sleep(1)
        }
        Sys.sleep(60)

    At the end of that script, I use a shell call to launch psftp.exe and upload the files.

    All of this works fine when I use my Eclipse profile to trigger Sweave, unless I have that batch file call at the end of the .Rnw. When it is located there, I get the error message:

        pdflatex.exe: Permission denied: c:\thepath\thetexfile.tex

    After that, the .tex file (as far as XP is concerned) is in use by another process and I have to reboot in order to delete it (and, of course, the PDF is not made). If I manually trigger the batch file after pdflatex.exe has done its thing, everything works fine.

    How can I make this work correctly using the tools I'm familiar with, viz., R and DOS-style batch files? I'm not sure if this is a SuperUser question or a StackOverflow question, so I'm starting here.

  • RHEL Java Application returns "No space left on device" but only 3% used

    - by FiveO
    My Java application throws the following exception when saving a new file in /opt/wso2 on CentOS 6.4:

        Caused by java.io.FileNotFoundException: ... (No space left on device)
        Caused by: java.io.FileNotFoundException: /opt/wso2/FrameworkFiles/trk_2014062500042488825_TRCK_PatfallHospis_pFromHospis_66601fb3-a03c-4149-93c3-6892e0a10fea.txt (No space left on device)
            at java.io.FileOutputStream.open(Native Method)
            at java.io.FileOutputStream.<init>(FileOutputStream.java:212)
            at java.io.FileOutputStream.<init>(FileOutputStream.java:99)
            at com.avintis.esb.framework.adapter.wso2.FrameworkAdapterWSO2.sendMessages(FrameworkAdapterWSO2.java:634)
            ... 23 more

    But when I run df -a I can see that the partition still has plenty of space available:

        [root@stzsi466 wso2]# df -a
        Filesystem                      1K-blocks    Used Available Use% Mounted on
        /dev/mapper/vg_stzsi466-lv_root  12054824 2116092   9326380  19% /
        proc                                    0       0         0    - /proc
        sysfs                                   0       0         0    - /sys
        devpts                                  0       0         0    - /dev/pts
        tmpfs                             4030764       0   4030764   0% /dev/shm
        /dev/sda1                          495844   53858    416386  12% /boot
        /dev/sdb1                        51605436 1424288  47559744   3% /opt/wso2
        none                                    0       0         0    - /proc/sys/fs/binfmt_misc

        [root@stzsi466 ~]# df -i
        Filesystem                       Inodes IUsed   IFree IUse% Mounted on
        /dev/mapper/vg_stzsi466-lv_root  765536 45181  720355    6% /
        tmpfs                           1007691     1 1007690    1% /dev/shm
        /dev/sda1                        128016    44  127972    1% /boot
        /dev/sdb1                       3276800  6137 3270663    1% /opt/wso2

    What is the problem here? Is it caused by Java on CentOS 6.4? I have another server running Red Hat RHEL 6.4 and everything works fine there, with the same Java etc. Does anyone know of this problem?
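
One way to narrow this down is to ask the kernel the same question from inside a process, rather than relying on df from a root shell. The following is a minimal Python sketch, not a diagnosis: the function names and the probe file name are made up for illustration, and it assumes a Unix-like system where os.statvfs is available. It reports free blocks and free inodes for the filesystem that actually contains a path, and then tries to create a small file there.

```python
import os


def space_report(path):
    """Return free-space figures for the filesystem that contains `path`."""
    st = os.statvfs(path)
    return {
        # f_bfree counts all free blocks, including those reserved for root.
        "free_bytes_root": st.f_bfree * st.f_frsize,
        # f_bavail counts only blocks an unprivileged process may use.
        "free_bytes_user": st.f_bavail * st.f_frsize,
        "free_inodes": st.f_ffree,
        "total_inodes": st.f_files,
    }


def can_create_file(directory):
    """Try to create a tiny probe file; return (ok, error_or_None)."""
    # The probe file name is hypothetical; any unused name would do.
    probe = os.path.join(directory, ".space_probe_%d" % os.getpid())
    try:
        with open(probe, "w") as handle:
            handle.write("x")
    except OSError as exc:
        return False, exc
    finally:
        # Clean up whether or not the write succeeded.
        if os.path.exists(probe):
            os.remove(probe)
    return True, None
```

Running space_report("/opt/wso2/FrameworkFiles") and can_create_file(...) as the same user as the Java process would show whether that user sees the same numbers as root does, and the errno on the returned exception distinguishes "No space left on device" from a quota or permission error.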

  • Celery daemon as a Ubuntu service does not consume tasks while running from terminal does

    - by Guy
    On Ubuntu 11.10, I have to issue Python tasks from Django using Celery. I'm currently testing on the same machine, but eventually the Celery worker should run on a remote machine.

    Django uses the following settings:

        BROKER_HOST = "127.0.0.1"
        BROKER_PORT = 5672
        BROKER_VHOST = "/my_vhost"
        BROKER_USER = "celery"
        BROKER_PASSWORD = "celery"

    I can also see my task queued at http://localhost:55672/#/queues

    The Celery daemon uses the following configuration (celeryconfig.py):

        BROKER_HOST = "127.0.0.1"
        BROKER_PORT = 5672
        BROKER_USER = "celery"
        BROKER_PASSWORD = "celery"
        BROKER_VHOST = "/my_vhost"
        CELERY_RESULT_BACKEND = "amqp"

        import os
        import sys
        sys.path.append(os.getcwd())

        CELERY_IMPORTS = ("tasks", )

    Running celeryd -l info works well, and now I want to run it as a service. I've followed the instructions from http://ask.github.com/celery/cookbook/daemonizing.html and now I'm trying to run it using:

        sudo /etc/init.d/celeryd start

    But the message is not being consumed, and there is no error in the Celery log either.

    /etc/default/celeryd:

        CELERYD_NODES="w1"
        CELERYD_CHDIR="/path/to/django/project"
        CELERYD_OPTS="--time-limit=300 --concurrency=1"
        CELERY_CONFIG_MODULE="celeryconfig"
        # %n will be replaced with the nodename.
        CELERYD_LOG_FILE="/var/log/celery/%n.log"
        CELERYD_PID_FILE="/var/run/celery/%n.pid"
        # Workers should run as an unprivileged user.
        CELERYD_USER="celery"
        CELERYD_GROUP="celery"

    I've also created a user named celery in Ubuntu; I'm not sure if it's necessary.

    Any help will be appreciated. Thanks, Guy

  • Configuring wsgi for a simple Python based site

    - by jbbarnes
    I have an Ubuntu 10.04 server that already has apache and wsgi working. I also have a python script that works just fine using the make_server command:

        if __name__ == '__main__':
            from wsgiref.simple_server import make_server
            srv = make_server('', 8080, display_status)
            srv.serve_forever()

    Now I would like to have the page always active without having to run the script manually. I looked at what Moin is doing. I found these lines in apache2.conf:

        WSGIScriptAlias /wiki /usr/local/share/moin/moin.wsgi
        WSGIDaemonProcess moin user=www-data group=www-data processes=5 threads=10 maximum-requests=1000 umask=0007
        WSGIProcessGroup moin

    And moin.wsgi is as listed:

        import sys, os
        sys.path.insert(0, '/usr/local/share/moin')
        from MoinMoin.web.serving import make_application
        application = make_application(shared=True)

    QUESTION: Can I create a similar section in apache2.conf pointing to another wsgi file? Like this:

        WSGIScriptAlias /status /mypath/status.wsgi
        WSGIDaemonProcess status user=www-data group=www-data processes=5 threads=10 maximum-requests=1000 umask=0007
        WSGIProcessGroup status

    And if so, what is required to convert my simple_server script into a daemonized process? Most of the information I find about wsgi is related to using it with frameworks like Django. I haven't found a simple howto detailing how to make this work. Thanks.
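
For reference, a WSGI script file for mod_wsgi can be very small, because Apache takes over the role that make_server() played. The sketch below shows what a status.wsgi along those lines might look like. The body of display_status here is a stand-in (the original function is not shown in the question), so treat its output as an assumption. What is not an assumption is that mod_wsgi looks for a module-level callable named application by default.

```python
def display_status(environ, start_response):
    """Stand-in for the poster's function: a plain WSGI callable."""
    body = b"status: ok\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI expects an iterable of byte strings.
    return [body]


# mod_wsgi imports this file and calls the object named "application".
# No make_server() or serve_forever() call is needed under Apache.
application = display_status
```

With a file like this at the path given to WSGIScriptAlias, the daemon process settings in apache2.conf decide how many processes and threads serve it. The script itself does not daemonize anything.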

  • Deploying a Django application in a virtual Ubuntu Server

    - by mfsaint
    I have a VirtualBox machine running Ubuntu Server 10.04 LTS. My intention is for this machine to work like a VPS, so that I can learn and prepare for when I get a VPS service. Apache + mod_wsgi for deploying the Django app seems the right choice to me. I have the domain (marianofalcon.com.ar) but nothing else, no DNS.

    The problem is that I'm pretty lost with all the deployment stuff. I know how to configure mod_wsgi (with the django.wsgi file) and Apache (creating a VirtualHost). Something is missing and I don't know what it is. I think that I lack networking skills and that's the big problem. Trying to host the app on VirtualBox adds some difficulty because I don't really know what IP to use.

    This is what I've got. File placed at /etc/apache2/sites-available:

        NameVirtualHost *:80
        <VirtualHost *:80>
            ServerAdmin [email protected]
            ServerName www.my-domain.com
            ServerAlias my-domain.com
            Alias /media /path/to/my/project/media
            DocumentRoot /path/to/my/project
            WSGIScriptAlias / /path/to/your/project/apache/django.wsgi
            ErrorLog /var/log/apache2/error.log
            LogLevel warn
            CustomLog /var/log/apache2/access.log combined
        </VirtualHost>

    django.wsgi file:

        import os, sys
        wsgi_dir = os.path.abspath(os.path.dirname(__file__))
        project_dir = os.path.dirname(wsgi_dir)
        sys.path.append(project_dir)
        project_settings = os.path.join(project_dir,'settings')
        os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'
        import django.core.handlers.wsgi
        application = django.core.handlers.wsgi.WSGIHandler()

  • Ubuntu "No space left on device" for /home, df shows 100% full, du shows much, much less

    - by Jon Cram
    On an Ubuntu 12.04 server, normal users can no longer create or add to files in /home, encountering a "No space left on device" error. The /home directory has a capacity of 1.7 terabytes and, as far as I can tell, is nowhere near full in terms of actual data stored or inodes used.

    df -h shows:

        Filesystem  Size  Used Avail Use% Mounted on
        /dev/md2    1.0T   18G  955G   2% /
        udev        7.7G  4.0K  7.7G   1% /dev
        tmpfs       3.1G  320K  3.1G   1% /run
        none        5.0M     0  5.0M   0% /run/lock
        none        7.7G     0  7.7G   0% /run/shm
        cgroup      7.7G     0  7.7G   0% /sys/fs/cgroup
        /dev/md3    1.7T  1.7T     0 100% /home
        /dev/md1    496M   45M  426M  10% /boot

    /home indeed looks rather full. du -hs /home suggests otherwise:

        1.4G /home

    There appears to be no inode issue. df -i:

        Filesystem     Inodes  IUsed     IFree IUse% Mounted on
        /dev/md2     67108864  75334  67033530    1% /
        udev          2013497    527   2012970    1% /dev
        tmpfs         2015816    440   2015376    1% /run
        none          2015816      2   2015814    1% /run/lock
        none          2015816      1   2015815    1% /run/shm
        cgroup        2015816      9   2015807    1% /sys/fs/cgroup
        /dev/md3    113909760 105981 113803779    1% /home
        /dev/md1       131072    239    130833    1% /boot

    I recently deleted many gigabytes of application cache and log data from /home; however, this was in the tens of gigabytes at best and nowhere near the capacity of /home.

    Update 1:

        du -hs --apparent-size /home
        1.2G /home
        du -hs /home
        1.4G /home

    What might be going on here?
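
The two tools measure different things: df reports what the filesystem says is allocated, while du adds up what it can reach by walking paths. A gap between them means space is allocated that no visible path accounts for. Common causes include deleted files still held open by a running process (lsof +L1 lists those) and files hidden underneath a mount point, though nothing in the post confirms either. The sketch below reproduces both measurements in Python so the gap can be computed directly; the function names are illustrative, and it assumes Linux, where st_blocks is counted in 512-byte units.

```python
import os


def du_bytes(root):
    """Sum allocated bytes reachable under `root`, like `du -x`."""
    root_dev = os.lstat(root).st_dev
    seen = set()  # (device, inode) pairs, so hard links count once
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into other mounted filesystems.
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev
        ]
        paths = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in paths:
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_blocks * 512
    return total


def df_used_bytes(path):
    """Bytes the filesystem itself reports as used, like `df`."""
    st = os.statvfs(path)
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def unaccounted_bytes(mount_point):
    """Space the filesystem has allocated that no visible path explains."""
    return df_used_bytes(mount_point) - du_bytes(mount_point)
```

Run as root against the mount point (here /home), a large positive result from unaccounted_bytes() points toward space held outside the visible directory tree, or toward filesystem-level damage that a check of the unmounted device might reveal.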

  • IPTABLE & IP-routed network solution for HOST net and VM's subnet

    - by Daniel
    I've got a KVM node managed by Proxmox VE 2.1 on Debian and a bunch of guest VMs. This is what my networking looks like:

        # network interface settings
        auto lo
        iface lo inet loopback

        # device: eth0
        auto eth0
        iface eth0 inet static
            address 175.219.59.209
            gateway 175.219.59.193
            netmask 255.255.255.224
            post-up echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp

    And I've got two working subnet solutions. The first:

        auto vmbr0
        iface vmbr0 inet static
            address 10.10.0.1
            netmask 255.255.0.0
            bridge_ports none
            bridge_stp off
            bridge_fd 0
            post-up ip route add 10.10.0.1/24 dev vmbr0

    This way I can reach the internet, resolve outside hosts, and update and download everything I need, but I can't reach one guest VM from any other VM inside my network.

    The second solution allows me to communicate between VMs:

        auto vmbr1
        iface vmbr1 inet static
            address 10.10.0.1
            netmask 255.255.255.0
            bridge_ports none
            bridge_stp off
            bridge_fd 0
            post-up echo 1 > /proc/sys/net/ipv4/ip_forward
            post-up iptables -t nat -A POSTROUTING -s '10.10.0.0/24' -o vmbr1 -j MASQUERADE
            post-down iptables -t nat -D POSTROUTING -s '10.10.0.0/24' -o vmbr1 -j MASQUERADE

    I can even NAT internal addresses:

        -t nat -I PREROUTING -p tcp --dport 789 -j DNAT --to-destination 10.10.0.220:345

    My inexperienced mind is ready to double the VMs' network adapters, one for the first solution and another for the second (with slightly different addresses), but I'm pretty sure that's a dumb way to resolve the problem and that everything can be resolved via iptables/ip route rules that I can't work out how to create. I've tried a dozen "wizard manuals" and "howtos" to combine both solutions, but without success.

    I'm looking for advice (and good reading links for networking beginners).
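
Before touching iptables, it can help to check what the addresses in the two stanzas actually describe. The short Python sketch below uses only the standard ipaddress module; the helper names are made up for illustration. It shows that both bridges are given the same host address, that the /24 sits inside the /16, and that the route target 10.10.0.1/24 is a host address rather than a network address. It does not by itself explain the symptoms described above.

```python
import ipaddress


def describe(cidr):
    """Split 'address/mask' into (host address, containing network)."""
    iface = ipaddress.ip_interface(cidr)
    return iface.ip, iface.network


def has_host_bits(cidr):
    """True if `cidr` is not a clean network address (or is not valid)."""
    try:
        ipaddress.ip_network(cidr, strict=True)
    except ValueError:
        return True
    return False


vmbr0_ip, vmbr0_net = describe("10.10.0.1/255.255.0.0")
vmbr1_ip, vmbr1_net = describe("10.10.0.1/255.255.255.0")

print(vmbr0_net)                       # 10.10.0.0/16
print(vmbr1_net)                       # 10.10.0.0/24
print(vmbr0_ip == vmbr1_ip)            # True: same address on both bridges
print(vmbr1_net.subnet_of(vmbr0_net))  # True: the /24 lies inside the /16
print(has_host_bits("10.10.0.1/24"))   # True: the route names a host, not a network
print(ipaddress.ip_address("10.10.0.220") in vmbr1_net)  # True: the DNAT target is in the /24
```

Since the two bridge definitions overlap completely, attaching guests to both at once would leave the host with two routes to the same addresses, which is one reason the "two adapters per VM" idea is unlikely to behave predictably.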

  • On linux, what does it mean when a directory has size 0 instead of 4096?

    - by kdt
    Here's a strange thing I haven't seen before: a directory whose size is reported by ls as 0 instead of 4096, and I can't create any files within it.

        # ls -ld lib home
        drwxr-xr-x. 2 root root 0 Feb 7 03:10 home    <-- it has zero size
        dr-xr-xr-x. 11 root root 4096 Feb 4 09:28 lib
        # touch home/foo
        touch: cannot touch `home/foo': No such file or directory    <-- and I can't create files in it
        # rm home
        rm: cannot remove `home': Is a directory    <-- look, it really is a dir

    So what does it mean for a directory to have size 0 instead of 4096? Filesystem is ext4 on fedora core 14.

    The output of mount is:

        /dev/mapper/vg_dev-lv_root on / type ext4 (rw)
        proc on /proc type proc (rw)
        sysfs on /sys type sysfs (rw)
        devpts on /dev/pts type devpts (rw,gid=5,mode=620)
        tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
        /dev/vda1 on /boot type ext4 (rw)
        none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
        sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

    Output of du -s /home:

        0 /home

    Output of stat /home:

        File: `/home'
        Size: 0 Blocks: 0 IO Block: 1024 directory
        Device: 15h/21d Inode: 34913 Links: 2
        Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
        Access: 2011-02-07 03:45:46.188995765 -0800
        Modify: 2011-02-07 03:11:59.980995019 -0800
        Change: 2011-02-06 07:58:45.874995002 -0800
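
One detail in the stat output is worth checking: the device number 15h/21d has major number 0, which is what the kernel uses for filesystems not backed by a block device. That hints that /home may not be a plain ext4 directory on the root volume but a separate, virtual filesystem mounted there (an automounter is one possibility, although the mount output shown does not list one, so this is only a guess). The Python sketch below, with an illustrative function name, compares a directory with its parent to see whether it sits on a different device.

```python
import os


def describe_dir(path):
    """Report size, device numbers, and mount-point status of a directory."""
    st = os.stat(path)
    parent = os.stat(os.path.join(path, os.pardir))
    return {
        "size": st.st_size,
        # Major 0 usually means "no backing block device".
        "device": (os.major(st.st_dev), os.minor(st.st_dev)),
        "same_device_as_parent": st.st_dev == parent.st_dev,
        "is_mount_point": os.path.ismount(path),
    }
```

If describe_dir("/home") reports a different device from "/" and is_mount_point is true, the zero size and the failed touch are properties of whatever is mounted at /home, not of ext4.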

  • Windows 7 Paging file apparently not being used

    - by Daniel F.
    I'm running Windows 7 Home Premium 32-bit on a motherboard with 24GB RAM. Of those 24GB, 20GB are assigned as a RAM disk via ASRock XFastRAM. This RAM disk has the drive letter X assigned to it. On X:\ I'm storing the temporary files folder, as well as pagefile.sys. Pagefile.sys is 6GB in size. X:\ usually has around 14GB of free space, so the temporary files are negligible; it's mostly the browsers storing their caches there.

    Now my issue is that Firefox is crashing a lot on me. No error message pops up, but I know that this is because it's out of memory. I could kind of live with that, but now that I've switched from Eclipse to Android Studio, I know that I'm in trouble, because Java isn't able to allocate memory, and Android Studio, together with the Java instances it launches, is quite a memory hog.

    So I tried to figure out what's wrong, and apparently Windows isn't swapping memory out to the paging file. While my applications are crashing (Firefox) or not starting (Java VMs), the paging file constantly sits at around 15% usage (checked with the Performance Monitor). 15% is approximately 1GB.

    I know that the correct solution would be to switch to 64-bit Windows, but I had to use the 32-bit version because of driver issues I had about two years ago, and I guess that I'll have them again if I reformat and install the 64-bit version. Also, the machine is running quite stably; the only issue is the memory, so I'd like to keep using it as it is (with the apps installed and configured).

    Is there a way to make Windows use the paging file more efficiently? None of my processes requires more than 1GB; I'd just like it to swap out some seldom-used stuff, like GoogleCrashHandler.exe, in order to have "more physical memory available". Is that possible?

  • MBR seems to be gone

    - by bobobobo
    So, horror story for everyone. I bought two spanking new HDDs. MM!! Gbitage. I removed all my old HDDs, physically labelled them, and was preparing to install all new HDDs (fresh sys install included!). To make sure which HDD was which, I popped each OLD HDD (data filled!) into a Thermaltake BlacX toaster. Surprisingly, BOTH couldn't be read. I didn't have static on my hands! I'm certain of it. I touched metal, touched wood, before beginning all this.

    Thinking that was strange, I hauled up the new sys, installed Win XP (of course!) on the new HDD, and now the two OLD HDDs (data filled!) that were put into the toaster cannot be read. And they had tons of data on them.

    I read about MBRs being nuked and it sounds like that is what happened. But I'm at a loss as to what to do. There are so many MBR recovery programs out there that I kind of feel overwhelmed. I don't want to lose my data by just picking one, yet it seems so close within reach that I'm not panicking anymore. Anybody have a play-by-play that I could follow? I just don't want to spend $900 on a data recovery center if I can do this myself.

  • Ubuntu 10.04 on virtualbox gives error: Target filesystem doesn't have /sbin/init \ No init found. Try passing init= bootarg

    - by Philip
    I'm a linux newbie and the only reason I have it installed is so I can stop having Windows incompatibility issues with Ruby on Rails. Having said that, it sure has been nice, and much faster, and I don't think I'll be doing any Winrails stuff anytime soon. So I created a virtualmachine using virtualbox and have had ubuntu on it for the last 3 weeks. Recently ubuntu asked if it could update a few things, I clicked 'ok'. Now it won't boot and I get this error: *mount: mounting /dev on /root/dev failed: No such file or directory mount: mounting /sys on /root/sys failed: No such file or directory ... Target filesystem doesn't have /sbin/init. No init found. Try passing init= bootarg BusyBox v1.13.3... (initramfs) _ * So I cruised the forums and there are a variety of solutions, but they all have to do with booting from the live cd. (which I assume is the ISO image I used to install ubuntu in the first place). But when I boot from that CD, it just hangs on the ubuntu screen, and the little dots keep cycling white to red, but it hung there for an hour so I think it was stuck. Not sure what I can do; can I do anything from the busybox shell (or whatever that is) to fix things? The thing is, it took about 10 hours to get everything the way I needed with all the gems and whatnot. And I didn't really write down what I tweaked, and I'm middle aged, so all that information has leaked out by now and I don't want to do it again. I'd really like to repair my existing install. One question you might have is, is there something wrong with the ISO? I don't think so, because I made a new virtual machine and used that same iso file to install a fresh ubuntu. Any help much appreciated. Phil

  • How can I start hostednetwork on Windows 7?

    - by Pirozek
    When I type in admin console command to start hostednetwork netsh wlan start hostednetwork it gives me this: The hosted network couldn't be started. The group or resource is not in the correct state to perform the requested operation. There is a hotfix from Microsoft but it didn't help me. Any advice? C:\Users\Pirozek>netsh wlan show driver Interface name: Wireless Network Connection 3 Driver : D-Link AirPlus DWL-G520 Wireless PCI Adapter(rev .B) Vendor : Atheros Communications Inc. Provider : Atheros Communications Inc. Date : 8.7.2009 Version : 8.0.0.171 INF file : C:\Windows\INF\oem108.inf Files : 2 total C:\Windows\system32\DRIVERS\athrx.sys C:\Windows\system32\drivers\vwifibus.sys Type : Native Wi-Fi Driver Radio types supported : 802.11b 802.11g FIPS 140-2 mode supported : Yes Hosted network supported : Yes Authentication and cipher supported in infrastructure mode: Open None Open WEP-40bit Shared WEP-40bit Open WEP-104bit Shared WEP-104bit Open WEP Shared WEP WPA-Enterprise TKIP WPA-Personal TKIP WPA2-Enterprise TKIP WPA2-Personal TKIP Vendor defined TKIP WPA2-Enterprise Vendor defined Vendor defined Vendor defined WPA-Enterprise CCMP WPA-Personal CCMP WPA2-Enterprise CCMP Vendor defined CCMP WPA2-Enterprise Vendor defined Vendor defined Vendor defined WPA2-Personal CCMP Authentication and cipher supported in ad-hoc mode: Open None Open WEP-40bit Open WEP-104bit Open WEP WPA2-Personal CCMP

  • DBCC CHECKDB fails and quits job, ambiguous error message.

    - by ddono25
    I received a notice that one of our servers' DBCC CHECKDB for all databases has been failing the past four times it has been run. We don't have any data prior to that, but it doesn't look like it has been succeeding for awhile. There are no errors in the log file only: DBCC results for 'sys.sysxmlfacet'. [SQLSTATE 01000] Msg 0, Sev 0, State 1: Unspecified error occurred on SQL Server. Connection may have been terminated by the server. [SQLSTATE HY000] There are 112 rows in 1 pages for object "sys.sysxmlfacet". [SQLSTATE 01000] I ran a DBCC CHECKDB using sp_MSForEachDB to get more accurate results and had the same error on the same DB but at a separate point: DBCC results for 'NameValuePair_Greek_CI_AS'. [SQLSTATE 01000] Msg 0, Sev 0, State 1: Unspecified error occurred on SQL Server. Connection may have been terminated by the server. [SQLSTATE HY000] There are 0 rows in 0 pages for object "NameValuePair_Greek_CI_AS". [SQLSTATE 01000] Also, the error-log states that the DBCC completed without errors for this database. I can't figure out how to track down this ambiguous issue that only happens on this database out of the dozens on this server. Any help is appreciated!

  • The Web Hosting Conundrum for "not quite" developers

    - by saltcod
    Hey all,

    Apologies if this post feels like it's been covered elsewhere, but I don't think it has. I've been down a winding web hosting road. To date, I've tried Joyent, Media Temple, Bluehost, Hostgator, and finally Linode. The reason for switching is likely obvious to everyone: speed. With the exception of the lightning-fast Linode, all of the shared hosts are absolutely sloooow.

    What to do when you're not really a "developer"

    While I've grown addicted to the speed of Linode, I really don't feel like it's where I should be. I have this nagging feeling in the back of my mind that one of these days (likely soon), I'm going to run into something that I won't be able to figure out and I'll have days' worth of downtime. Just the other day, for example, I realized that one of my domains wasn't sending emails. After 4(!) hours looking into the problem, I still can't get sendmail or postfix to work. Four hours!!

    I want to be a Drupal expert, not an Ubuntu expert

    That's really the heart of my problem: I spend way too much time learning Ubuntu's ins and outs, and not nearly enough time working on Drupal. So here goes: is there a web host out there anywhere that offers the speed of Linode, but will let me focus on Drupal instead of sys-admin-ing? Thanks!

    [ I know, I know. There are going to be lots of people who read this saying "just learn Ubuntu like a real developer". And I get that. I do. But when I work full-time and try to develop some of these sites in my evenings and weekends, I'm really feeling like the sys-admin stuff gets in the way.

  • HPET missing from available clocksources on CentOS

    - by squareone
    I am having trouble using HPET on my physical machine. It is not available, even though I have enabled it in my BIOS, forced it in grub, and triple-checked that my kernel includes HPET in its compilation.

    Motherboard: Supermicro X9DRW
    Processor: 2x Intel(R) Xeon(R) CPU E5-2640
    SAS Controller: LSI Logic / Symbios Logic SAS2004 PCI-Express Fusion-MPT SAS-2 [Spitfire] (rev 03)
    Distro: CentOS 6.3
    Kernel: 3.4.21-rt32 #2 SMP PREEMPT RT x86_64 GNU/Linux
    Grub: hpet=force clocksource=hpet

    .config file:

        CONFIG_HPET_TIMER=y
        CONFIG_HPET_EMULATE_RTC=y
        CONFIG_HPET=y

    dmesg | grep hpet:

        Command line: ro root=/dev/mapper/vg_xxxx-lv_root rd_NO_LUKS rd_LVM_LV=vg_xxxx/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_xxxx/lv_swap rd_NO_DM LANG=en_US.UTF-8 rhgb quiet panic=5 hpet=force clocksource=hpet
        Kernel command line: ro root=/dev/mapper/vg_xxxx-lv_root rd_NO_LUKS rd_LVM_LV=vg_xxxx/lv_root KEYBOARDTYPE=pc KEYTABLE=us rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=vg_xxxx/lv_swap rd_NO_DM LANG=en_US.UTF-8 rhgb quiet panic=5 hpet=force clocksource=hpet

    cat /sys/devices/system/clocksource/clocksource0/current_clocksource:

        tsc

    cat /sys/devices/system/clocksource/clocksource0/available_clocksource:

        tsc jiffies

    What is even more confusing is that I have about a dozen other machines that use the same kernel .config and can use HPET fine. I fear it is a hardware issue, but would appreciate any advice or help with getting HPET available. Thanks in advance!
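
Since a dozen other machines work with the same kernel configuration, comparing what each one reports may be the quickest way to confirm that this box is the odd one out. The Python sketch below reads the same two sysfs files quoted above. The function names are illustrative, and the base directory is a parameter so the code can be pointed at a copy of the files collected from another host.

```python
import os

# The sysfs directory quoted in the post.
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current clocksource, list of available clocksources)."""
    def read_words(name):
        with open(os.path.join(base, name)) as handle:
            return handle.read().split()

    current = read_words("current_clocksource")
    available = read_words("available_clocksource")
    return (current[0] if current else None), available


def hpet_available(base=CLOCKSOURCE_DIR):
    """True if the kernel lists hpet among its usable clocksources."""
    _, available = read_clocksources(base)
    return "hpet" in available
```

On the machine described here, read_clocksources() would return ("tsc", ["tsc", "jiffies"]) and hpet_available() would be false. If the kernel never registered an HPET clocksource at boot, the full dmesg output (not only the lines matching "hpet" in lowercase) is where any message about the timer would most likely show up.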

  • Disk operations in windows 7 are slow

    - by Skadlig
    My computer started lagging last Sunday. I tried to reboot it and it failed. Booting into failsafe mode takes around two hours. It mainly freezes on two files: scsiport.sys and classpnp.sys. When it has finally started, all disk operations are really slow. After it has run for a while it gets faster, probably because data has been moved into RAM.

    Before that, it froze on another file that was associated with Avast, but uninstalling Avast didn't really help. A critical Windows update was installed on Sunday, but rolling back the update didn't help. I suspected the sound card, but disabling the sound card drivers didn't help either. I have an inkling that Intel Rapid Storage Technology might be acting up, but I can't reinstall it from failsafe mode and I haven't been able to log into normal mode for a while.

    I would appreciate suggestions on how to get into normal mode again and/or what the root cause might be.

  • Solving embarrassingly parallel problems using Python multiprocessing

    - by gotgenes
    How does one use multiprocessing to tackle embarrassingly parallel problems?

    Embarrassingly parallel problems typically consist of three basic parts:

    1. Read input data (from a file, database, tcp connection, etc.).
    2. Run calculations on the input data, where each calculation is independent of any other calculation.
    3. Write results of calculations (to a file, database, tcp connection, etc.).

    We can parallelize the program in two dimensions:

    - Part 2 can run on multiple cores, since each calculation is independent; order of processing doesn't matter.
    - Each part can run independently. Part 1 can place data on an input queue, part 2 can pull data off the input queue and put results onto an output queue, and part 3 can pull results off the output queue and write them out.

    This seems a most basic pattern in concurrent programming, but I am still lost in trying to solve it, so let's write a canonical example to illustrate how this is done using multiprocessing.

    Here is the example problem: Given a CSV file with rows of integers as input, compute their sums. Separate the problem into three parts, which can all run in parallel:

    1. Process the input file into raw data (lists/iterables of integers)
    2. Calculate the sums of the data, in parallel
    3. Output the sums

    Below is a traditional, single-process-bound Python program which solves these three tasks:

        #!/usr/bin/env python
        # -*- coding: UTF-8 -*-
        # basicsums.py
        """A program that reads integer values from a CSV file and writes out
        their sums to another CSV file.
        """

        import csv
        import optparse
        import sys

        def make_cli_parser():
            """Make the command line interface parser."""
            usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV",
                    __doc__,
                    """
        ARGUMENTS:
            INPUT_CSV: an input CSV file with rows of numbers
            OUTPUT_CSV: an output file that will contain the sums\
        """])
            cli_parser = optparse.OptionParser(usage)
            return cli_parser

        def parse_input_csv(csvfile):
            """Parses the input CSV and yields tuples with the index of the row
            as the first element, and the integers of the row as the second
            element.

            The index is zero-index based.

            :Parameters:
            - `csvfile`: a `csv.reader` instance

            """
            for i, row in enumerate(csvfile):
                row = [int(entry) for entry in row]
                yield i, row

        def sum_rows(rows):
            """Yields a tuple with the index of each input list of integers
            as the first element, and the sum of the list of integers as the
            second element.

            The index is zero-index based.

            :Parameters:
            - `rows`: an iterable of tuples, with the index of the original row
              as the first element, and a list of integers as the second element

            """
            for i, row in rows:
                yield i, sum(row)

        def write_results(csvfile, results):
            """Writes a series of results to an outfile, where the first column
            is the index of the original row of data, and the second column is
            the result of the calculation.

            The index is zero-index based.

            :Parameters:
            - `csvfile`: a `csv.writer` instance to which to write results
            - `results`: an iterable of tuples, with the index (zero-based) of
              the original row as the first element, and the calculated result
              from that row as the second element

            """
            for result_row in results:
                csvfile.writerow(result_row)

        def main(argv):
            cli_parser = make_cli_parser()
            opts, args = cli_parser.parse_args(argv)
            if len(args) != 2:
                cli_parser.error("Please provide an input file and output file.")
            infile = open(args[0])
            in_csvfile = csv.reader(infile)
            outfile = open(args[1], 'w')
            out_csvfile = csv.writer(outfile)
            # gets an iterable of rows that's not yet evaluated
            input_rows = parse_input_csv(in_csvfile)
            # sends the rows iterable to sum_rows() for results iterable, but
            # still not evaluated
            result_rows = sum_rows(input_rows)
            # finally evaluation takes place as a chain in write_results()
            write_results(out_csvfile, result_rows)
            infile.close()
            outfile.close()

        if __name__ == '__main__':
            main(sys.argv[1:])

    Let's take this program and rewrite it to use multiprocessing to parallelize the three parts outlined above. Below is a skeleton of this new, parallelized program, which needs to be fleshed out to address the parts in the comments:

        #!/usr/bin/env python
        # -*- coding: UTF-8 -*-
        # multiproc_sums.py
        """A program that reads integer values from a CSV file and writes out
        their sums to another CSV file, using multiple processes if desired.
        """

        import csv
        import multiprocessing
        import optparse
        import sys

        NUM_PROCS = multiprocessing.cpu_count()

        def make_cli_parser():
            """Make the command line interface parser."""
            usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV",
                    __doc__,
                    """
        ARGUMENTS:
            INPUT_CSV: an input CSV file with rows of numbers
            OUTPUT_CSV: an output file that will contain the sums\
        """])
            cli_parser = optparse.OptionParser(usage)
            cli_parser.add_option('-n', '--numprocs', type='int',
                    default=NUM_PROCS,
                    help="Number of processes to launch [DEFAULT: %default]")
            return cli_parser

        def main(argv):
            cli_parser = make_cli_parser()
            opts, args = cli_parser.parse_args(argv)
            if len(args) != 2:
                cli_parser.error("Please provide an input file and output file.")
            infile = open(args[0])
            in_csvfile = csv.reader(infile)
            outfile = open(args[1], 'w')
            out_csvfile = csv.writer(outfile)
            # Parse the input file and add the parsed data to a queue for
            # processing, possibly chunking to decrease communication between
            # processes.

            # Process the parsed data as soon as any (chunks) appear on the
            # queue, using as many processes as allotted by the user
            # (opts.numprocs); place results on a queue for output.
            #
            # Terminate processes when the parser stops putting data in the
            # input queue.

            # Write the results to disk as soon as they appear on the output
            # queue.

            # Ensure all child processes have terminated.

            # Clean up files.
            infile.close()
            outfile.close()

        if __name__ == '__main__':
            main(sys.argv[1:])

    These pieces of code, as well as another piece of code that can generate example CSV files for testing purposes, can be found on github.

    I would appreciate any insight here as to how you concurrency gurus would approach this problem. Here are some questions I had when thinking about this problem. Bonus points for addressing any/all:

    - Should I have child processes for reading in the data and placing it into the queue, or can the main process do this without blocking until all input is read?
    - Likewise, should I have a child process for writing the results out from the processed queue, or can the main process do this without having to wait for all the results?
    - Should I use a process pool for the sum operations? If yes, what method do I call on the pool to get it to start processing the results coming into the input queue, without blocking the input and output processes, too? apply_async()? map_async()? imap()? imap_unordered()?
    - Suppose we didn't need to siphon off the input and output queues as data entered them, but could wait until all input was parsed and all results were calculated (e.g., because we know all the input and output will fit in system memory). Should we change the algorithm in any way (e.g., not run any processes concurrently with I/O)?
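
One possible way to flesh out the skeleton, offered as a sketch rather than as the canonical answer, is to let multiprocessing.Pool own the queues. With Pool.imap(), the main process hands the pool a generator of parsed rows and iterates over the results; the pool feeds workers from the generator in the background, so reading, summing, and writing overlap without hand-built queues or a separate reader or writer process. The names sum_row and main and the default chunksize of 100 are choices made for this example (main here takes file paths, unlike the original), and optparse handling is left out to keep it short.

```python
import csv
import multiprocessing
import sys


def parse_input_csv(csvfile):
    """Yield (row_index, [ints]) for each row of a csv.reader."""
    for i, row in enumerate(csvfile):
        yield i, [int(entry) for entry in row]


def sum_row(indexed_row):
    """Worker function; top-level so it can be pickled for child processes."""
    i, row = indexed_row
    return i, sum(row)


def main(in_path, out_path, numprocs=None, chunksize=100):
    """Sum each row of in_path in parallel and write (index, sum) rows."""
    with open(in_path, newline="") as infile, \
            open(out_path, "w", newline="") as outfile:
        rows = parse_input_csv(csv.reader(infile))
        writer = csv.writer(outfile)
        # numprocs=None lets Pool default to the number of CPUs.
        with multiprocessing.Pool(processes=numprocs) as pool:
            # imap() yields results in input order as they become ready;
            # chunksize batches rows to cut inter-process traffic.
            for result in pool.imap(sum_row, rows, chunksize):
                writer.writerow(result)


# Only runs when called as: python multiproc_sums.py INPUT_CSV OUTPUT_CSV
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Because every result carries its row index, imap_unordered() would also work if the output order does not matter, and it avoids holding back finished results while an earlier chunk is still running. For a job as cheap as summing a few integers, the cost of pickling rows between processes may well exceed the arithmetic, so a larger chunksize (or the single-process version) can turn out faster; measuring on real data is the only reliable guide.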

    Read the article

  • How to know if the client has terminated in sockets

    - by shadyabhi
    Suppose, I have a connected socket after writing this code.. if ((sd = accept(socket_d, (struct sockaddr *)&client_addr, &alen)) < 0) { perror("accept failed\n"); exit(1); } How can I know at the server side that client has exited. My whole program actually does the following.. Accepts a connection from client Starts a new thread that reads messages from that particular client and then broadcast this message to all the connected clients. If you want to see the whole code... In this whole code. I am also struggling with one more problem that whenever I kill a client with Ctrl+C, my server terminates abruptly.. It would be nice if anyone could suggest what the problem is.. #include <sys/types.h> #include <sys/socket.h> #include <netinet/in.h> #include <arpa/inet.h> #include <netdb.h> #include <stdio.h> #include <unistd.h> #include <stdlib.h> #include <string.h> #include <signal.h> #include <errno.h> #include <pthread.h> /*CONSTANTS*/ #define DEFAULT_PORT 10000 #define LISTEN_QUEUE_LIMIT 6 #define TOTAL_CLIENTS 10 #define CHAR_BUFFER 256 /*GLOBAL VARIABLE*/ int current_client = 0; int connected_clients[TOTAL_CLIENTS]; extern int errno; void *client_handler(void * socket_d); int main(int argc, char *argv[]) { struct sockaddr_in server_addr;/* structure to hold server's address*/ int socket_d; /* listening socket descriptor */ int port; /* protocol port number */ int option_value; /* needed for setsockopt */ pthread_t tid[TOTAL_CLIENTS]; port = (argc > 1)?atoi(argv[1]):DEFAULT_PORT; /* Socket Server address structure */ memset((char *)&server_addr, 0, sizeof(server_addr)); server_addr.sin_family = AF_INET; /* set family to Internet */ server_addr.sin_addr.s_addr = INADDR_ANY; /* set the local IP address */ server_addr.sin_port = htons((u_short)port); /* Set port */ /* Create socket */ if ( (socket_d = socket(PF_INET, SOCK_STREAM, 0)) < 0) { fprintf(stderr, "socket creation failed\n"); exit(1); } /* Make listening socket's port reusable */ if (setsockopt(socket_d, 
SOL_SOCKET, SO_REUSEADDR, (char *)&option_value, sizeof(option_value)) < 0) { fprintf(stderr, "setsockopt failure\n"); exit(1); } /* Bind a local address to the socket */ if (bind(socket_d, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) { fprintf(stderr, "bind failed\n"); exit(1); } /* Specify size of request queue */ if (listen(socket_d, LISTEN_QUEUE_LIMIT) < 0) { fprintf(stderr, "listen failed\n"); exit(1); } memset(connected_clients,0,sizeof(int)*TOTAL_CLIENTS); for (;;) { struct sockaddr_in client_addr; /* structure to hold client's address*/ int alen = sizeof(client_addr); /* length of address */ int sd; /* connected socket descriptor */ if ((sd = accept(socket_d, (struct sockaddr *)&client_addr, &alen)) < 0) { perror("accept failed\n"); exit(1); } else printf("\n I got a connection from (%s , %d)\n",inet_ntoa(client_addr.sin_addr),ntohs(client_addr.sin_port)); if (pthread_create(&tid[current_client],NULL,(void *)client_handler,(void *)sd) != 0) { perror("pthread_create error"); continue; } connected_clients[current_client]=sd; current_client++; /*Incrementing Client number*/ } return 0; } void *client_handler(void *connected_socket) { int sd; sd = (int)connected_socket; for ( ; ; ) { ssize_t n; char buffer[CHAR_BUFFER]; for ( ; ; ) { if (n = read(sd, buffer, sizeof(char)*CHAR_BUFFER) == -1) { perror("Error reading from client"); pthread_exit(1); } int i=0; for (i=0;i<current_client;i++) { if (write(connected_clients[i],buffer,sizeof(char)*CHAR_BUFFER) == -1) perror("Error sending messages to a client while multicasting"); } } } } My client side is this (Maye be irrelevant while answering my question) #include <stdio.h> #include <sys/types.h> #include <sys/socket.h> #include <netinet/in.h> #include <netdb.h> #include <string.h> #include <stdlib.h> void error(char *msg) { perror(msg); exit(0); } void *listen_for_message(void * fd) { int sockfd = (int)fd; int n; char buffer[256]; bzero(buffer,256); printf("YOUR MESSAGE: "); fflush(stdout); while (1) 
{ n = read(sockfd,buffer,256); if (n < 0) error("ERROR reading from socket"); if (n == 0) pthread_exit(1); printf("\nMESSAGE BROADCAST: %sYOUR MESSAGE: ",buffer); fflush(stdout); } } int main(int argc, char *argv[]) { int sockfd, portno, n; struct sockaddr_in serv_addr; struct hostent *server; pthread_t read_message; char buffer[256]; if (argc < 3) { fprintf(stderr,"usage %s hostname port\n", argv[0]); exit(0); } portno = atoi(argv[2]); sockfd = socket(AF_INET, SOCK_STREAM, 0); if (sockfd < 0) error("ERROR opening socket"); server = gethostbyname(argv[1]); if (server == NULL) { fprintf(stderr,"ERROR, no such host\n"); exit(0); } bzero((char *) &serv_addr, sizeof(serv_addr)); serv_addr.sin_family = AF_INET; bcopy((char *)server->h_addr, (char *)&serv_addr.sin_addr.s_addr, server->h_length); serv_addr.sin_port = htons(portno); if (connect(sockfd,&serv_addr,sizeof(serv_addr)) < 0) error("ERROR connecting"); bzero(buffer,256); if (pthread_create(&read_message,NULL,(void *)listen_for_message,(void *)sockfd) !=0 ) { perror("error creating thread"); } while (1) { fgets(buffer,255,stdin); n = write(sockfd,buffer,256); if (n < 0) error("ERROR writing to socket"); bzero(buffer,256); } return 0; }
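For the question of how the server can tell that a client has gone away, the usual signal on a stream socket is a read that returns zero bytes. The server code above never checks for that case. In addition, `n = read(...) == -1` assigns the result of the comparison to `n` because `==` binds tighter than `=`. The abrupt server exit after Ctrl+C on a client is consistent with SIGPIPE being raised when writing to a closed connection, though that is an inference from the code rather than something the post confirms. The sketch below shows the same idea in Python, which ignores SIGPIPE by default and raises an exception instead. The function name `relay_until_closed` and its return values are hypothetical.

```python
import socket


def relay_until_closed(conn, peers, bufsize=256):
    """Forward data from conn to peers; return when the client has gone away."""
    while True:
        try:
            data = conn.recv(bufsize)
        except ConnectionResetError:
            return "reset"       # peer vanished abruptly
        if not data:
            return "closed"      # empty read: peer closed its end in an orderly way
        for peer in peers:
            try:
                peer.sendall(data)   # forward only the bytes actually received
            except (BrokenPipeError, ConnectionResetError):
                pass                 # that peer is gone; a real server would drop it
```

In the C version, the equivalent checks would be `n == 0` after `read()` and either ignoring SIGPIPE or handling `EPIPE` from `write()`.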

    Read the article

  • C sockets, chat server and client, problem echoing back.

    - by wretrOvian
    Hi This is my chat server : #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/types.h> #include <sys/socket.h> #include <netdb.h> #include <string.h> #define LISTEN_Q 20 #define MSG_SIZE 1024 struct userlist { int sockfd; struct sockaddr addr; struct userlist *next; }; int main(int argc, char *argv[]) { // declare. int listFD, newFD, fdmax, i, j, bytesrecvd; char msg[MSG_SIZE], ipv4[INET_ADDRSTRLEN]; struct addrinfo hints, *srvrAI; struct sockaddr_storage newAddr; struct userlist *users, *uptr, *utemp; socklen_t newAddrLen; fd_set master_set, read_set; // clear sets FD_ZERO(&master_set); FD_ZERO(&read_set); // create a user list users = (struct userlist *)malloc(sizeof(struct userlist)); users->sockfd = -1; //users->addr = NULL; users->next = NULL; // clear hints memset(&hints, 0, sizeof hints); // prep hints hints.ai_family = AF_INET; hints.ai_socktype = SOCK_STREAM; hints.ai_flags = AI_PASSIVE; // get srver info if(getaddrinfo("localhost", argv[1], &hints, &srvrAI) != 0) { perror("* ERROR | getaddrinfo()\n"); exit(1); } // get a socket if((listFD = socket(srvrAI->ai_family, srvrAI->ai_socktype, srvrAI->ai_protocol)) == -1) { perror("* ERROR | socket()\n"); exit(1); } // bind socket bind(listFD, srvrAI->ai_addr, srvrAI->ai_addrlen); // listen on socket if(listen(listFD, LISTEN_Q) == -1) { perror("* ERROR | listen()\n"); exit(1); } // add listfd to master_set FD_SET(listFD, &master_set); // initialize fdmax fdmax = listFD; while(1) { // equate read_set = master_set; // run select if(select(fdmax+1, &read_set, NULL, NULL, NULL) == -1) { perror("* ERROR | select()\n"); exit(1); } // query all sockets for(i = 0; i <= fdmax; i++) { if(FD_ISSET(i, &read_set)) { // found active sockfd if(i == listFD) { // new connection // accept newAddrLen = sizeof newAddr; if((newFD = accept(listFD, (struct sockaddr *)&newAddr, &newAddrLen)) == -1) { perror("* ERROR | select()\n"); exit(1); } // resolve ip if(inet_ntop(AF_INET, &(((struct sockaddr_in 
*)&newAddr)->sin_addr), ipv4, INET_ADDRSTRLEN) == -1) { perror("* ERROR | inet_ntop()"); exit(1); } fprintf(stdout, "* Client Connected | %s\n", ipv4); // add to master list FD_SET(newFD, &master_set); // create new userlist component utemp = (struct userlist*)malloc(sizeof(struct userlist)); utemp->next = NULL; utemp->sockfd = newFD; utemp->addr = *((struct sockaddr *)&newAddr); // iterate to last node for(uptr = users; uptr->next != NULL; uptr = uptr->next) { } // add uptr->next = utemp; // update fdmax if(newFD > fdmax) fdmax = newFD; } else { // existing sockfd transmitting data // read if((bytesrecvd = recv(i, msg, MSG_SIZE, 0)) == -1) { perror("* ERROR | recv()\n"); exit(1); } msg[bytesrecvd] = '\0'; // find out who sent? for(uptr = users; uptr->next != NULL; uptr = uptr->next) { if(i == uptr->sockfd) break; } // resolve ip if(inet_ntop(AF_INET, &(((struct sockaddr_in *)&(uptr->addr))->sin_addr), ipv4, INET_ADDRSTRLEN) == -1) { perror("* ERROR | inet_ntop()"); exit(1); } // print fprintf(stdout, "%s\n", msg); // send to all for(j = 0; j <= fdmax; j++) { if(FD_ISSET(j, &master_set)) { if(send(j, msg, strlen(msg), 0) == -1) perror("* ERROR | send()"); } } } // handle read from client } // end select result handle } // end looping fds } // end while return 0; } This is my client: #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/types.h> #include <sys/socket.h> #include <netdb.h> #include <string.h> #define MSG_SIZE 1024 int main(int argc, char *argv[]) { // declare. 
int newFD, bytesrecvd, fdmax; char msg[MSG_SIZE]; fd_set master_set, read_set; struct addrinfo hints, *srvrAI; // clear sets FD_ZERO(&master_set); FD_ZERO(&read_set); // clear hints memset(&hints, 0, sizeof hints); // prep hints hints.ai_family = AF_INET; hints.ai_socktype = SOCK_STREAM; hints.ai_flags = AI_PASSIVE; // get srver info if(getaddrinfo(argv[1], argv[2], &hints, &srvrAI) != 0) { perror("* ERROR | getaddrinfo()\n"); exit(1); } // get a socket if((newFD = socket(srvrAI->ai_family, srvrAI->ai_socktype, srvrAI->ai_protocol)) == -1) { perror("* ERROR | socket()\n"); exit(1); } // connect to server if(connect(newFD, srvrAI->ai_addr, srvrAI->ai_addrlen) == -1) { perror("* ERROR | connect()\n"); exit(1); } // add to master, and add keyboard FD_SET(newFD, &master_set); FD_SET(STDIN_FILENO, &master_set); // initialize fdmax if(newFD > STDIN_FILENO) fdmax = newFD; else fdmax = STDIN_FILENO; while(1) { // equate read_set = master_set; if(select(fdmax+1, &read_set, NULL, NULL, NULL) == -1) { perror("* ERROR | select()"); exit(1); } // check server if(FD_ISSET(newFD, &read_set)) { // read data if((bytesrecvd = recv(newFD, msg, MSG_SIZE, 0)) < 0 ) { perror("* ERROR | recv()"); exit(1); } msg[bytesrecvd] = '\0'; // print fprintf(stdout, "%s\n", msg); } // check keyboard if(FD_ISSET(STDIN_FILENO, &read_set)) { // read data from stdin if((bytesrecvd = read(STDIN_FILENO, msg, MSG_SIZE)) < 0) { perror("* ERROR | read()"); exit(1); } msg[bytesrecvd] = '\0'; // send if((send(newFD, msg, bytesrecvd, 0)) == -1) { perror("* ERROR | send()"); exit(1); } } } return 0; } The problem is with the part where the server recv()s data from an FD, then tries echoing back to all [send() ]; it just dies, w/o errors, and my client is left looping :(
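One thing worth checking in the server above: the send loop walks every descriptor in `master_set`, and that set also contains the listening socket `listFD`. Sending on a listening socket fails, and on some systems it may raise SIGPIPE, whose default action ends the process without any message. Treat that as a likely cause, not a confirmed one. A return value of 0 from `recv()` (client disconnected) is also not handled. The sketch below shows a broadcast that skips the listener and the sender and reports failed peers; `broadcast` and its parameters are hypothetical names chosen for illustration.

```python
import socket


def broadcast(message, clients, listener=None, sender=None):
    """Send message to each connected client, skipping listener and sender.

    Returns the sockets whose send failed so the caller can drop them.
    """
    dead = []
    for sock in clients:
        if sock is listener or sock is sender:
            continue  # never send on the listening socket
        try:
            sock.sendall(message)
        except OSError:
            dead.append(sock)  # peer is gone or the descriptor is invalid
    return dead
```

The caller would then remove every socket in the returned list from its set of watched descriptors.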

    Read the article

  • Microbenchmark showing process-switching faster than thread-switching; what's wrong?

    - by Yang
    I have two simple microbenchmarks trying to measure thread- and process-switching overheads, but the process-switching overhead. The code is living here, and r1667 is pasted below: https://assorted.svn.sourceforge.net/svnroot/assorted/sandbox/trunk/src/c/process_switch_bench.c // on zs, ~2.1-2.4us/switch #include <stdlib.h> #include <fcntl.h> #include <stdint.h> #include <stdio.h> #include <semaphore.h> #include <unistd.h> #include <sys/wait.h> #include <sys/types.h> #include <sys/time.h> #include <pthread.h> uint32_t COUNTER; pthread_mutex_t LOCK; pthread_mutex_t START; sem_t *s0, *s1, *s2; void * threads ( void * unused ) { // Wait till we may fire away sem_wait(s2); for (;;) { pthread_mutex_lock(&LOCK); pthread_mutex_unlock(&LOCK); COUNTER++; sem_post(s0); sem_wait(s1); } return 0; } int64_t timeInMS () { struct timeval t; gettimeofday(&t, NULL); return ( (int64_t)t.tv_sec * 1000 + (int64_t)t.tv_usec / 1000 ); } int main ( int argc, char ** argv ) { int64_t start; pthread_t t1; pthread_mutex_init(&LOCK, NULL); COUNTER = 0; s0 = sem_open("/s0", O_CREAT, 0022, 0); if (s0 == 0) { perror("sem_open"); exit(1); } s1 = sem_open("/s1", O_CREAT, 0022, 0); if (s1 == 0) { perror("sem_open"); exit(1); } s2 = sem_open("/s2", O_CREAT, 0022, 0); if (s2 == 0) { perror("sem_open"); exit(1); } int x, y, z; sem_getvalue(s0, &x); sem_getvalue(s1, &y); sem_getvalue(s2, &z); printf("%d %d %d\n", x, y, z); pid_t pid = fork(); if (pid) { pthread_create(&t1, NULL, threads, NULL); pthread_detach(t1); // Get start time and fire away start = timeInMS(); sem_post(s2); sem_post(s2); // Wait for about a second sleep(1); // Stop thread pthread_mutex_lock(&LOCK); // Find out how much time has really passed. sleep won't guarantee me that // I sleep exactly one second, I might sleep longer since even after being // woken up, it can take some time before I gain back CPU time. Further // some more time might have passed before I obtained the lock! 
int64_t time = timeInMS() - start; // Correct the number of thread switches accordingly COUNTER = (uint32_t)(((uint64_t)COUNTER * 2 * 1000) / time); printf("Number of process switches in about one second was %u\n", COUNTER); printf("roughly %f microseconds per switch\n", 1000000.0 / COUNTER); // clean up kill(pid, 9); wait(0); sem_close(s0); sem_close(s1); sem_unlink("/s0"); sem_unlink("/s1"); sem_unlink("/s2"); } else { if (1) { sem_t *t = s0; s0 = s1; s1 = t; } threads(0); // never return } return 0; } https://assorted.svn.sourceforge.net/svnroot/assorted/sandbox/trunk/src/c/thread_switch_bench.c // From <http://stackoverflow.com/questions/304752/how-to-estimate-the-thread-context-switching-overhead> // on zs, ~4-5us/switch; tried making COUNTER updated only by one thread, but no difference #include <stdlib.h> #include <stdint.h> #include <stdio.h> #include <pthread.h> #include <unistd.h> #include <sys/time.h> uint32_t COUNTER; pthread_mutex_t LOCK; pthread_mutex_t START; pthread_cond_t CONDITION; void * threads ( void * unused ) { // Wait till we may fire away pthread_mutex_lock(&START); pthread_mutex_unlock(&START); int first=1; pthread_mutex_lock(&LOCK); // If I'm not the first thread, the other thread is already waiting on // the condition, thus Ihave to wake it up first, otherwise we'll deadlock if (COUNTER > 0) { pthread_cond_signal(&CONDITION); first=0; } for (;;) { if (first) COUNTER++; pthread_cond_wait(&CONDITION, &LOCK); // Always wake up the other thread before processing. The other // thread will not be able to do anything as long as I don't go // back to sleep first. 
pthread_cond_signal(&CONDITION); } pthread_mutex_unlock(&LOCK); return 0; } int64_t timeInMS () { struct timeval t; gettimeofday(&t, NULL); return ( (int64_t)t.tv_sec * 1000 + (int64_t)t.tv_usec / 1000 ); } int main ( int argc, char ** argv ) { int64_t start; pthread_t t1; pthread_t t2; pthread_mutex_init(&LOCK, NULL); pthread_mutex_init(&START, NULL); pthread_cond_init(&CONDITION, NULL); pthread_mutex_lock(&START); COUNTER = 0; pthread_create(&t1, NULL, threads, NULL); pthread_create(&t2, NULL, threads, NULL); pthread_detach(t1); pthread_detach(t2); // Get start time and fire away start = timeInMS(); pthread_mutex_unlock(&START); // Wait for about a second sleep(1); // Stop both threads pthread_mutex_lock(&LOCK); // Find out how much time has really passed. sleep won't guarantee me that // I sleep exactly one second, I might sleep longer since even after being // woken up, it can take some time before I gain back CPU time. Further // some more time might have passed before I obtained the lock! int64_t time = timeInMS() - start; // Correct the number of thread switches accordingly COUNTER = (uint32_t)(((uint64_t)COUNTER * 2 * 1000) / time); printf("Number of thread switches in about one second was %u\n", COUNTER); printf("roughly %f microseconds per switch\n", 1000000.0 / COUNTER); return 0; }

    Read the article

  • C socket programming: select() is returning 0 despite messages sent from server

    - by Fantastic Fourier
    Hey all, I'm using select() to recv() messages from server, using TCP/IP. When I send() messages from the server, it returns a reasonable number of bytes, saying it's sent successful. And it does get to the client successfully when I use while loop to just recv(). Everything is fine and dandy. while(1) recv() // obviously pseudocode However, when I try to use select(), select() returns 0 from timeout (which is set to 1 second) and for the life of me I cannot figure out why it doesn't see the messages sent from the server. I should also mention that when the server disconnects, select() doesn't see that either, where as if I were to use recv(), it would return 0 to indicate that the connection using the socket has been closed. Any inputs or thoughts are deeply appreciated. #include <arpa/inet.h> #include <errno.h> #include <fcntl.h> #include <netdb.h> #include <netinet/in.h> #include <pthread.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <strings.h> #include <sys/select.h> #include <sys/socket.h> #include <sys/time.h> #include <sys/types.h> #include <time.h> #include <unistd.h> #define SERVER_PORT 10000 #define MAX_CONNECTION 20 #define MAX_MSG 50 struct client { char c_name[MAX_MSG]; char g_name[MAX_MSG]; int csock; int host; // 0 = not host of a multicast group struct sockaddr_in client_address; struct client * next_host; struct client * next_client; }; struct fd_info { char c_name[MAX_MSG]; int socks_inuse[MAX_CONNECTION]; int sock_fd, max_fd; int exit; struct client * c_sys; struct sockaddr_in c_address[MAX_CONNECTION]; struct sockaddr_in server_address; struct sockaddr_in client_address; fd_set read_set; }; struct message { char c_name[MAX_MSG]; char g_name[MAX_MSG]; char _command[3][MAX_MSG]; char _payload[MAX_MSG]; struct sockaddr_in client_address; struct client peer; }; int main(int argc, char * argv[]) { char * host; char * temp; int i, sockfd; int msg_len, rv, ready; int connection, management, socketread; int 
sockfds[MAX_CONNECTION]; // for three threads that handle new connections, user inputs and select() for sockets pthread_t connection_handler, manager, socket_reader; struct sockaddr_in server_address, client_address; struct hostent * hserver, cserver; struct timeval timeout; struct message msg; struct fd_info info; info.exit = 0; // exit information: if exit = 1, threads quit info.c_sys = NULL; // looking up from the host database if (argc == 3) { host = argv[1]; // server address strncpy(info.c_name, argv[2], strlen(argv[2])); // client name } else { printf("plz read the manual, kthxbai\n"); exit(1); } printf("host is %s and hp is %p\n", host, hserver); hserver = gethostbyname(host); if (hserver) { printf("host found: %s\n", hserver->h_name ); } else { printf("host not found\n"); exit(1); } // setting up address and port structure information on serverside bzero((char * ) &server_address, sizeof(server_address)); // copy zeroes into string server_address.sin_family = AF_INET; memcpy(&server_address.sin_addr, hserver->h_addr, hserver->h_length); server_address.sin_port = htons(SERVER_PORT); bzero((char * ) &client_address, sizeof(client_address)); // copy zeroes into string client_address.sin_family = AF_INET; client_address.sin_addr.s_addr = htonl(INADDR_ANY); client_address.sin_port = htons(SERVER_PORT); // opening up socket sockfd = socket(AF_INET, SOCK_STREAM, 0); if (sockfd < 0) exit(1); else { printf("socket is opened: %i \n", sockfd); info.sock_fd = sockfd; } // sets up time out option for the bound socket timeout.tv_sec = 1; // seconds timeout.tv_usec = 0; // micro seconds ( 0.5 seconds) setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(struct timeval)); // binding socket to a port rv = bind(sockfd, (struct sockaddr *) &client_address, sizeof(client_address)); if (rv < 0) { printf("MAIN: ERROR bind() %i: %s\n", errno, strerror(errno)); exit(1); } else printf("socket is bound\n"); printf("MAIN: %li \n", client_address.sin_addr.s_addr); // 
connecting rv = connect(sockfd, (struct sockaddr *) &server_address, sizeof(server_address)); info.server_address = server_address; info.client_address = client_address; info.sock_fd = sockfd; info.max_fd = sockfd; printf("rv = %i\n", rv); if (rv < 0) { printf("MAIN: ERROR connect() %i: %s\n", errno, strerror(errno)); exit(1); } else printf("connected\n"); fd_set readset; FD_ZERO(&readset); FD_ZERO(&info.read_set); FD_SET(info.sock_fd, &info.read_set); while(1) { readset = info.read_set; printf("MAIN: %i \n", readset); ready = select((info.max_fd)+1, &readset, NULL, NULL, &timeout); if(ready == -1) { sleep(2); printf("TEST: MAIN: ready = -1. %s \n", strerror(errno)); } else if (ready == 0) { sleep(2); printf("TEST: MAIN: ready = 0. %s \n", strerror(errno)); } else if (ready > 0) { printf("TEST: MAIN: ready = %i. %s at socket %i \n", ready, strerror(errno), i); for(i = 0; i < ((info.max_fd)+1); i++) { if(FD_ISSET(i, &readset)) { rv = recv(sockfd, &msg, 500, 0); if(rv < 0) continue; else if(rv > 0) printf("MAIN: TEST: %s %s \n", msg._command[0], msg._payload); else if (rv == 0) { sleep(3); printf("MAIN: TEST: SOCKET CLOSEDDDDDD \n"); } FD_CLR(i, &readset); } } } info.read_set = readset; } // close connection close(sockfd); printf("socket closed. BYE! \n"); return(0); }
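Two details in the loop above may explain the behavior. First, `FD_CLR(i, &readset)` followed by `info.read_set = readset` removes the socket from the master set, so later `select()` calls watch nothing. Second, Linux `select()` updates the `timeout` struct in place, so it should be re-initialized before every call. The same `timeout` is also reused for `SO_RCVTIMEO`. The sketch below illustrates the intended pattern in Python, where the read list and timeout are supplied fresh on each call; `wait_for_message` is a hypothetical helper, not code from the post.

```python
import select
import socket


def wait_for_message(sock, timeout=1.0, bufsize=500):
    """Wait up to timeout seconds for data on sock.

    Returns None on timeout, b"" if the peer closed, else the bytes read.
    """
    # The read list and timeout are passed fresh on every call, mirroring
    # the need to rebuild fd_set and struct timeval before each C select().
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(bufsize)
```

With this shape, a closed connection shows up as a readable socket whose read returns no data, which matches what a bare `recv()` reports in the C client.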

    Read the article

  • RHEL - blocked FC remote port time out: saving binding

    - by Dev G
    My Server went into a faulty state since the database could not write on the partition. I found out that the partition went into Read Only mode. Finally to fix it, I had to do a hard reboot. Linux 2.6.18-164.el5PAE #1 SMP Tue Aug 18 15:59:11 EDT 2009 i686 i686 i386 GNU/Linux /var/log/messages Oct 31 00:56:45 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network Oct 31 00:57:05 ota3g1 Had[17275]: VCS CRITICAL V-16-1-50086 CPU usage on ota3g1.mtsallstream.com is 100% Oct 31 01:01:47 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network Oct 31 01:06:50 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network Oct 31 01:11:52 ota3g1 Had[17275]: VCS ERROR V-16-1-10214 Concurrency Violation:CurrentCount increased above 1 for failover group sg_network Oct 31 01:12:10 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x2 received Data: x2 x20 x80000 x0 x0 Oct 31 01:12:10 ota3g1 kernel: lpfc 0000:29:00.1: 1:1303 Link Up Event x3 received Data: x3 x1 x10 x1 x0 x0 0 Oct 31 01:12:12 ota3g1 kernel: lpfc 0000:29:00.1: 1:1305 Link Down Event x4 received Data: x4 x20 x80000 x0 x0 Oct 31 01:12:40 ota3g1 kernel: rport-8:0-0: blocked FC remote port time out: saving binding Oct 31 01:12:40 ota3g1 kernel: lpfc 0000:29:00.1: 1:(0):0203 Devloss timeout on WWPN 20:25:00:a0:b8:74:f5:65 NPort x0000e4 Data: x0 x7 x0 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 38617577 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283532153 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, 
sector 90825 Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-16. Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 868841 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-10. Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37759889 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283349449 Oct 31 01:12:40 ota3g1 kernel: printk: 6 messages suppressed. Oct 31 01:12:40 ota3g1 kernel: Aborting journal on device dm-12. Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-12) in ext3_reserve_inode_write: Journal has aborted Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 1545 Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-16 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 12745 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-10, logical block 1545 Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-16) in ext3_reserve_inode_write: Journal has aborted Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-10 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37749121 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 0 Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-12 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-12) in ext3_dirty_inode: Journal has aborted Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, 
sector 37757897 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 1097 Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-12 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283337089 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 0 Oct 31 01:12:40 ota3g1 kernel: lost page write due to I/O error on dm-16 Oct 31 01:12:40 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:40 ota3g1 kernel: EXT3-fs error (device dm-16) in ext3_dirty_inode: Journal has aborted Oct 31 01:12:40 ota3g1 kernel: end_request: I/O error, dev sdi, sector 37749121 Oct 31 01:12:40 ota3g1 kernel: Buffer I/O error on device dm-12, logical block 0 Oct 31 01:12:41 ota3g1 kernel: lost page write due to I/O error on dm-12 Oct 31 01:12:41 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 Oct 31 01:12:41 ota3g1 kernel: end_request: I/O error, dev sdi, sector 283337089 Oct 31 01:12:41 ota3g1 kernel: Buffer I/O error on device dm-16, logical block 0 Oct 31 01:12:41 ota3g1 kernel: lost page write due to I/O error on dm-16 Oct 31 01:12:41 ota3g1 kernel: sd 8:0:0:4: SCSI error: return code = 0x00010000 df -h Filesystem Size Used Avail Use% Mounted on /dev/mapper/cciss-root 4.9G 730M 3.9G 16% / /dev/mapper/cciss-home 9.7G 1.2G 8.1G 13% /home /dev/mapper/cciss-var 9.7G 494M 8.8G 6% /var /dev/mapper/cciss-usr 15G 2.6G 12G 19% /usr /dev/mapper/cciss-tmp 3.9G 153M 3.6G 5% /tmp /dev/sda1 996M 43M 902M 5% /boot tmpfs 5.9G 0 5.9G 0% /dev/shm /dev/mapper/cciss-product 25G 16G 7.4G 68% /product /dev/mapper/cciss-opt 20G 4.5G 14G 25% /opt /dev/mapper/dg_db1-vol_db1_system 18G 2.2G 15G 14% /database/OTADB/sys /dev/mapper/dg_db1-vol_db1_undo 18G 5.8G 12G 35% /database/OTADB/undo /dev/mapper/dg_db1-vol_db1_redo 8.9G 4.3G 4.2G 51% /database/OTADB/redo /dev/mapper/dg_db1-vol_db1_sgbd 8.9G 654M 7.8G 8% 
/database/OTADB/admin /dev/mapper/dg_db1-vol_db1_arch 98G 24G 69G 26% /database/OTADB/arch /dev/mapper/dg_db1-vol_db1_indexes 240G 14G 214G 6% /database/OTADB/index /dev/mapper/dg_db1-vol_db1_data 275G 47G 215G 18% /database/OTADB/data /dev/mapper/dg_dbrman-vol_db_rman 8.9G 351M 8.1G 5% /database/RMAN /dev/mapper/dg_app1-vol_app1 151G 113G 31G 79% /files/ota /etc/fstab /dev/cciss/root / ext3 defaults 1 1 /dev/cciss/home /home ext3 defaults 1 2 /dev/cciss/var /var ext3 defaults 1 2 /dev/cciss/usr /usr ext3 defaults 1 2 /dev/cciss/tmp /tmp ext3 defaults 1 2 LABEL=/boot /boot ext3 defaults 1 2 tmpfs /dev/shm tmpfs defaults 0 0 devpts /dev/pts devpts gid=5,mode=620 0 0 sysfs /sys sysfs defaults 0 0 proc /proc proc defaults 0 0 /dev/cciss/swap swap swap defaults 0 0 /dev/cciss/product /product ext3 defaults 1 2 /dev/cciss/opt /opt ext3 defaults 1 2 /dev/dg_db1/vol_db1_system /database/OTADB/sys ext3 defaults 1 2 /dev/dg_db1/vol_db1_undo /database/OTADB/undo ext3 defaults 1 2 /dev/dg_db1/vol_db1_redo /database/OTADB/redo ext3 defaults 1 2 /dev/dg_db1/vol_db1_sgbd /database/OTADB/admin ext3 defaults 1 2 /dev/dg_db1/vol_db1_arch /database/OTADB/arch ext3 defaults 1 2 /dev/dg_db1/vol_db1_indexes /database/OTADB/index ext3 defaults 1 2 /dev/dg_db1/vol_db1_data /database/OTADB/data ext3 defaults 1 2 /dev/dg_dbrman/vol_db_rman /database/RMAN ext3 defaults 1 2 /dev/dg_app1/vol_app1 /files/ota ext3 defaults 1 2 Thanks for all the help.

    Read the article

  • arp problems with transparent bridge on linux

    - by Mink
    I've been trying to secure the virtual machines on my ESX server by putting them behind a transparent bridge with 2 interfaces, one in front and one at the back. My intention is to put all the firewall rules in one place (instead of on each virtual server). As the bridge, I've been using a blank new virtual machine based on Arch Linux (but I suspect it doesn't matter which flavor of Linux it is). What I have is 2 virtual switches (thus two virtual networks, VN_front and VN_back), each with 2 types of ports (switched/separated, or promiscuous, where the machine can see all packets). On my bridge machine, I've set up 2 virtual NICs, one on VN_front and one on VN_back, both in promisc mode. I've created a bridge br0 with both NICs in it: brctl addbr br0 brctl stp br0 off brctl addif br0 front_if brctl addif br0 back_if Then brought them up: ifconfig front_if 0.0.0.0 promisc ifconfig back_if 0.0.0.0 promisc ifconfig br0 0.0.0.0 (I use promisc mode because I'm not sure I can do without it; I worry that otherwise the packets wouldn't reach the NICs.) Then I took one of my virtual servers sitting on VN_front and plugged it into VN_back instead (that's the nifty use case I'm thinking about: being able to move my servers around just by changing the VN they are plugged into, without changing anything in the configuration). Then I looked at the MACs "seen" by my addressless bridge using brctl showmacs br0, and it did show my servers from both sides. I get something that looks like this: port no mac addr is local? ageing timer 2 00:0c:29:e1:54:75 no 9.27 1 00:0c:29:fd:86:0c no 9.27 2 00:50:56:90:05:86 no 73.38 1 00:50:56:90:05:88 no 0.10 2 00:50:56:90:05:8b yes 0.00 << FRONT VN 1 00:50:56:90:05:8c yes 0.00 << BACK VN 2 00:50:56:90:19:18 no 13.55 2 00:50:56:90:3c:cf no 13.57 The thing is that the servers that are plugged in front/back are not shown on the correct port. I suspect some horrible thing is happening in the ARP world... 
:-/ If I ping from a front virtual server to a back virtual server, I can only see the back machine if that back machine pings something in the front. As soon as I stop the ping from the back machine, the ping from the front machine stops getting through... I've noticed that if the back machine pings, then its port on the bridge is the correct one... I've tried to play with the arp_ switch of /proc/sys, but with no clear effect on the end result... /proc/sys/net/ipv4/ip_forward doesn't seem to be of any use when using a bridge (seems it's all taken care of by brctl) /proc/sys/net/ipv4/conf//arp_ don't seem to change much either... (tried arp_announce to 2 or 8 - like suggested elsewhere - and arp_ignore to 0 or 1 ) All the examples I've seen have a different subnet on either side like 10.0.1.0/24 and 10.0.2.0/24... In my case I want 10.0.1.0/24 on both side (just like a transparent switch - except it's a hidden fw ). Turning stp on/off doesn't seem to have any impact on my issue. It's as if the arp packets where getting through the bridge, corrupting the other side with false data... I've tried to use the -arp on each interface, br0, front, back... it breaks the thing altogether... I suspect it has something to do with both side being on the same subnet... I've thought about putting all my machine behind the fw, so as to have all the same subnet at the back... but I'm stuck with my provider's gateway standing at the front with part of my subnet (in fact 3 appliance to route the whole subnet), so I'll always have ips from the same subnet on both side, whatever I do... (I'm using fixed front IPs on my delegated subnet). I'm at a loss... -_-'' Thx for your help. (As anyone tried something like this? from within ESXi?) 
(It's not just a stunt, the idea is to have something like fail2ban running on some servers, sending their banned IP to the bridge/fw so that it too could ban them - saving all the other servers from that same attacker in one go, allowing for some honeypot that would trigger the fw from any kind of suitable response, and stuffs of the sort... I am aware I could use something like snort, but it addresses some completely different kind of problems, in a completely different way... )
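To see which port the bridge currently associates with each MAC while the pings are running, the brctl showmacs output can be summarised by a short script. This is only a minimal sketch in Python, assuming the four-column layout shown in the question; the names SHOWMACS and macs_by_port are hypothetical, and the sample text is the table from the question with the "<< FRONT VN" / "<< BACK VN" annotations dropped.

```python
from collections import defaultdict

# Sample `brctl showmacs br0` output, copied from the question above.
SHOWMACS = """\
port no mac addr                is local?       ageing timer
  2     00:0c:29:e1:54:75       no                 9.27
  1     00:0c:29:fd:86:0c       no                 9.27
  2     00:50:56:90:05:86       no                73.38
  1     00:50:56:90:05:88       no                 0.10
  2     00:50:56:90:05:8b       yes                0.00
  1     00:50:56:90:05:8c       yes                0.00
  2     00:50:56:90:19:18       no                13.55
  2     00:50:56:90:3c:cf       no                13.57
"""


def macs_by_port(text):
    """Group learned (non-local) MAC addresses by bridge port number."""
    ports = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: <port> <mac> <yes|no> <age>.
        # The header row starts with the word "port", so it is skipped here.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        port, mac, is_local = int(fields[0]), fields[1], fields[2]
        if is_local == "no":
            ports[port].append(mac)
    return dict(ports)


if __name__ == "__main__":
    for port, macs in sorted(macs_by_port(SHOWMACS).items()):
        print(f"port {port}: {', '.join(macs)}")
```

Running this against fresh output before and after the back machine starts pinging would show whether a given MAC moves from one port to the other, which is the behaviour described above.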

    Read the article

  • Linux Kernel crash mutex_lock_slowpath "blocked for more than 120 seconds". What to do?

    - by Roddick
    I have an out-of-the-box Debian Lenny with the stock kernel 2.6.26-2-amd64. It is a brand new server that is used to 5% of its potential, CPU- and disk-wise, meaning it is probably not crashing because of overload. Every few days it freezes with hundreds of these messages in the console log:

        : [284847.828428] INFO: task apache2:12473 blocked for more than 120 seconds.
        : [284847.868468] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        : [284847.912759] apache2 D ffff8101bc6b7ab0 0 12473 14358
        : [284847.912763] ffff810160d5bc50 0000000000000082 ffff8101c0002e40 0000000000000000
        : [284847.912766] ffff8101a7c42950 ffff810327d92810 ffff8101a7c42bd8 0000000400000044
        : [284847.912770] ffff8101c0002e40 00000000000612d0 0000000000000000 00000040000612d0
        : [284847.912773] Call Trace:
        : [284847.912786] [<ffffffff80429b0d>] __mutex_lock_slowpath+0x64/0x9b
        : [284847.912790] [<ffffffff80429972>] mutex_lock+0xa/0xb
        : [284847.912794] [<ffffffff802a20b9>] do_lookup+0x82/0x1c1
        : [284847.912800] [<ffffffff802a4271>] __link_path_walk+0x87a/0xd19
        : [284847.912805] [<ffffffff80295844>] kmem_getpages+0x96/0x15f
        : [284847.912808] [<ffffffff80295fb7>] ____cache_alloc_node+0x6d/0x106
        : [284847.912814] [<ffffffff802a4756>] path_walk+0x46/0x8b
        : [284847.912819] [<ffffffff802a4a82>] do_path_lookup+0x158/0x1cf
        : [284847.912822] [<ffffffff802a3879>] getname+0x140/0x1a7
        : [284847.912827] [<ffffffff802a53f1>] __user_walk_fd+0x37/0x4c
        : [284847.912831] [<ffffffff8029e381>] vfs_lstat_fd+0x18/0x47
        : [284847.912840] [<ffffffff8029e3c9>] sys_newlstat+0x19/0x31
        : [284847.912848] [<ffffffff8020beda>] system_call_after_swapgs+0x8a/0x8f

    Almost all traces have __mutex_lock_slowpath as the top-level entry. Only some have a different trace:

        : [284847.737386] INFO: task apache2:12472 blocked for more than 120 seconds.
        : [284847.777551] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        : [284847.824881] apache2 D ffff8101bc6b7ab0 0 12472 14358
        : [284847.824886] ffff8101b9cc1c50 0000000000000086 ffffffffa0131e0a 0000000000000002
        : [284847.824889] ffff8102e7454300 ffff810324c6cad0 ffff8102e7454588 0000000000000000
        : [284847.824893] 0000000000000001 0000000000000296 0000000000000003 ffff8101b9cc1c58
        : [284847.824896] Call Trace:
        : [284847.828403] [<ffffffffa0131e0a>] :ext3:__ext3_journal_dirty_metadata+0x1e/0x46
        : [284847.828412] [<ffffffff80429b0d>] __mutex_lock_slowpath+0x64/0x9b
        : [284847.828418] [<ffffffff80429972>] mutex_lock+0xa/0xb
        : [284847.828421] [<ffffffff802a20b9>] do_lookup+0x82/0x1c1
        : [284847.828427] [<ffffffff802a4271>] __link_path_walk+0x87a/0xd19
        : [284847.828428] [<ffffffff80271296>] find_lock_page+0x1f/0x8a
        : [284847.828428] [<ffffffff80273182>] filemap_fault+0x1c2/0x33c
        : [284847.828428] [<ffffffff802a4756>] path_walk+0x46/0x8b
        : [284847.828428] [<ffffffff802a4a82>] do_path_lookup+0x158/0x1cf
        : [284847.828428] [<ffffffff802a3879>] getname+0x140/0x1a7
        : [284847.828428] [<ffffffff802a53f1>] __user_walk_fd+0x37/0x4c
        : [284847.828428] [<ffffffff8029e381>] vfs_lstat_fd+0x18/0x47
        : [284847.828428] [<ffffffff8029e3c9>] sys_newlstat+0x19/0x31
        : [284847.828428] [<ffffffff8020beda>] system_call_after_swapgs+0x8a/0x8f

        kernel: [1912668.466347] INFO: task apache2:17984 blocked for more than 120 seconds.
        [1912668.507035] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        : [1912668.555165] apache2 D ffff8101c5637ba0 0 17984 17282
        : [1912668.596752] ffff810166a7dd30 0000000000000086 0000000000000000 ffff810166a7dcd8
        : [1912668.643341] ffff8101c563c880 ffff81024505f000 0000000000000002 ffff810166a7dd68
        : [1912668.699566] 0000000000000086 00000000000cb1a0 0000000000000000 ffff81017f344d60
        : [1912668.744773] Call Trace:
        : [1912668.761754] [<ffffffff8022a3ed>] pick_next_task_fair+0x6e/0x7a
        : [1912668.829311] [<ffffffff802be0e2>] bio_alloc_bioset+0x89/0xd9
        : [1912668.861930] [<ffffffff8024ac3a>] getnstimeofday+0x39/0x98
        : [1912668.897005] [<ffffffff802710f6>] sync_page+0x0/0x41
        : [1912668.927868] [<ffffffff80429487>] io_schedule+0x5c/0x9e
        : [1912668.960286] [<ffffffff80271132>] sync_page+0x3c/0x41
        : [1912668.991756] [<ffffffff804295fa>] __wait_on_bit_lock+0x36/0x66
        : [1912669.031757] [<ffffffff802710e3>] __lock_page+0x5e/0x64
        : [1912669.064191] [<ffffffff802461d3>] wake_bit_function+0x0/0x23
        : [1912669.100100] [<ffffffff80281bc5>] handle_mm_fault+0x5e4/0x8de
        : [1912669.134531] [<ffffffff802461a5>] autoremove_wake_function+0x0/0x2e
        : [1912669.174623] [<ffffffff802aa108>] fcntl_setlk+0x1cf/0x291
        : [1912669.210623] [<ffffffff802461a5>] autoremove_wake_function+0x0/0x2e
        : [1912669.246923] [<ffffffff802a677f>] sys_fcntl+0x280/0x2f7

    After googling for "mutex_lock_slowpath" I can only find kernel mailing list discussions saying that this issue was introduced in some commit, without reference to a version. The discussions are as recent as Jan 25, 2011. The kernel I am using is from Debian Lenny, from a year ago. What should I do? Is this bug even fixed in the kernel? If it's such an obvious bug, why does it happen so rarely? Should I download the latest kernel from kernel.org and upgrade? Should I use Debian backports to install a new "approved" kernel? Am I missing something? What to do?
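With hundreds of these reports in the log, it can help to tabulate which function sits at the top of each call trace rather than reading them one by one. The following is a minimal sketch in Python, assuming the log line format shown in the question; top_frames, TASK_RE, FRAME_RE and the shortened LOG sample are hypothetical names and data used only for illustration.

```python
import re
from collections import Counter

# Shortened sample built from lines that appear in the question above.
LOG = """\
: [284847.828428] INFO: task apache2:12473 blocked for more than 120 seconds.
: [284847.912759] apache2 D ffff8101bc6b7ab0 0 12473 14358
: [284847.912773] Call Trace:
: [284847.912786] [<ffffffff80429b0d>] __mutex_lock_slowpath+0x64/0x9b
: [284847.912790] [<ffffffff80429972>] mutex_lock+0xa/0xb
: [284847.737386] INFO: task apache2:12472 blocked for more than 120 seconds.
: [284847.824896] Call Trace:
: [284847.828403] [<ffffffffa0131e0a>] :ext3:__ext3_journal_dirty_metadata+0x1e/0x46
: [284847.828412] [<ffffffff80429b0d>] __mutex_lock_slowpath+0x64/0x9b
"""

# "INFO: task <name>:<pid> blocked for more than ..."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than")
# "[<address>] [:module:]function+0xoffset/0xsize"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(?::\w+:)?(\w+)\+0x")


def top_frames(log):
    """Return (task, pid, first_frame) for every hung-task report in the log."""
    results = []
    current = None
    for line in log.splitlines():
        task = TASK_RE.search(line)
        if task:
            current = (task.group(1), int(task.group(2)))
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            results.append((*current, frame.group(1)))
            current = None  # keep only the first frame of each report
    return results


if __name__ == "__main__":
    reports = top_frames(LOG)
    for name, count in Counter(f for _, _, f in reports).most_common():
        print(count, name)
```

A summary like this only shows where the tasks are waiting, not who holds the lock, so it is a starting point for a bug report rather than a diagnosis.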

    Read the article

  • T-SQL Tuesday #33: Trick Shots: Undocumented, Underdocumented, and Unknown Conspiracies!

    - by Most Valuable Yak (Rob Volk)
    Mike Fal (b | t) is hosting this month's T-SQL Tuesday on Trick Shots. I love this choice because I've been preoccupied with sneaky/tricky/evil SQL Server stuff for a long time and have been presenting on it for the past year. Mike's directives were "Show us a cool trick or process you developed…It doesn’t have to be useful", which most of my blogging definitely fits, and "Tell us what you learned from this trick…tell us how it gave you insight in to how SQL Server works", which is definitely a new concept. I've done a lot of reading and watching on SQL Server Internals and even attended training, but sometimes I need to go explore on my own, using my own tools and techniques. It's an itch I get every few months, and, well, it sure beats workin'.

    I've found some people to be intimidated by SQL Server's internals, and I'll admit there are A LOT of internals to keep track of, but there are tons of excellent resources that clearly document most of them, and show how knowing even the basics of internals can dramatically improve your database's performance. It may seem like rocket science, or even brain surgery, but you don't have to be a genius to understand it. Although being an "evil genius" can help you learn some things they haven't told you about. ;)

    This blog post isn't a traditional "deep dive" into internals, it's more of an approach to find out how a program works. It utilizes an extremely handy tool from an even more extremely handy suite of tools, Sysinternals. I'm not the only one who finds Sysinternals useful for SQL Server: Argenis Fernandez (b | t), Microsoft employee and former T-SQL Tuesday host, has an excellent presentation on how to troubleshoot SQL Server using Sysinternals, and I highly recommend it. Argenis didn't cover the Strings.exe utility, but I'll be using it to "hack" the SQL Server executable (DLL and EXE) files.
    Please note that I'm not promoting software piracy or applying these techniques to attack SQL Server via internal knowledge. This is strictly educational and doesn't reveal any proprietary Microsoft information. And since Argenis works for Microsoft and demonstrated Sysinternals with SQL Server, I'll just let him take the blame for it. :P (The truth is I've used Strings.exe on SQL Server before I ever met Argenis.)

    Once you download and install Strings.exe you can run it from the command line. For our purposes we'll want to run this in the Binn folder of your SQL Server instance (I'm referencing SQL Server 2012 RTM):

        cd "C:\Program Files\Microsoft SQL Server\MSSQL11\MSSQL\Binn"
        C:\Program Files\Microsoft SQL Server\MSSQL11\MSSQL\Binn> strings *sql*.dll > sqldll.txt
        C:\Program Files\Microsoft SQL Server\MSSQL11\MSSQL\Binn> strings *sql*.exe > sqlexe.txt

    I've limited myself to DLLs and EXEs that have "sql" in their names. There are quite a few more but I haven't examined them in any detail. (Homework assignment for you!)

    If you run this yourself you'll get 2 text files, one with all the extracted strings from every SQL DLL file, and the other with the SQL EXE strings. You can open these in Notepad, but you're better off using Notepad++, EditPad, Emacs, Vim or another more powerful text editor, as these will be several megabytes in size.

    And when you do open it…you'll find…a TON of gibberish. (If you think that's bad, just try opening the raw DLL or EXE file in Notepad. And by the way, don't do this in production, or even on a running instance of SQL Server.) Even if you don't clean up the file, you can still use your editor's search function to find a keyword like "SELECT" or some other item you expect to be there. As dumb as this sounds, I sometimes spend my lunch break just scanning the raw text for anything interesting. I'm boring like that.

    Sometimes though, having these files available can lead to some incredible learning experiences.
    For me the most recent time was after reading Joe Sack's post on non-parallel plan reasons. He mentions a new SQL Server 2012 execution plan element called NonParallelPlanReason, and demonstrates a query that generates "MaxDOPSetToOne". Joe (formerly on the Microsoft SQL Server product team, so he knows this stuff) mentioned that this new element was not currently documented and tried a few more examples to see what other reasons could be generated.

    Since I'd already run Strings.exe on the SQL Server DLLs and EXE files, it was easy to run grep/find/findstr for MaxDOPSetToOne on those extracts. Once I found which file it belonged to (sqlmin.dll) I opened the text to see if the other reasons were listed. As you can see in my comment on Joe's blog, there were about 20 additional non-parallel reasons. And while it's not "documentation" of this underdocumented feature, the names are pretty self-explanatory about what can prevent parallel processing. I especially like the ones about cursors (more ammo!) and am curious about the PDW compilation and Cloud DB replication reasons.

    One reason completely stumped me: NoParallelHekatonPlan. What the heck is a hekaton? Google and Wikipedia were vague, and the top results were not in English. I found one reference to Greek, stating "hekaton" can be translated as "hundredfold"; with a little more Wikipedia-ing this leads to hecto, the prefix for "one hundred" as a unit of measure. I'm not sure why Microsoft chose hekaton for such a plan name, but having already learned some Greek I figured I might as well dig some more in the DLL text for hekaton.
    Here's what I found:

        hekaton_slow_param_passing
        Occurs when a Hekaton procedure call dispatch goes to slow parameter passing code path
        The reason why Hekaton parameter passing code took the slow code path
        hekaton_slow_param_pass_reason

        sp_deploy_hekaton_database
        sp_undeploy_hekaton_database
        sp_drop_hekaton_database
        sp_checkpoint_hekaton_database
        sp_restore_hekaton_database

        e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\hkproc.cpp
        e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\matgen.cpp
        e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\matquery.cpp
        e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\sqlmeta.cpp
        e:\sql11_main_t\sql\ntdbms\hekaton\sqlhost\sqllang\resultset.cpp

    Interesting! The first 4 entries mention parameters and "slow code". Could this be the foundation of the mythical DBCC RUNFASTER command? Have I been passing my parameters the slow way all this time? And what about those sp_xxxx_hekaton_database procedures? Could THEY be the secret to a faster SQL Server? Could they promise a "hundredfold" improvement in performance? Are these special, super-undocumented DIB (databases in black)?

    I decided to look in the SQL Server system views for any objects with hekaton in the name, or references to them, in hopes of discovering some new code that would answer all my questions:

        SELECT name FROM sys.all_objects WHERE name LIKE '%hekaton%'
        SELECT name FROM sys.all_objects WHERE object_definition(OBJECT_ID) LIKE '%hekaton%'

    Which revealed:

        name
        ------------------------
        (0 row(s) affected)

        name
        ------------------------
        sp_createstats
        sp_recompile
        sp_updatestats
        (3 row(s) affected)

    Hmm. Well that didn't find much. Looks like these procedures are seriously undocumented, unknown, perhaps forbidden knowledge. Maybe a part of some unspeakable evil? (No, I'm not paranoid, I just like mysteries and thought that punching this up with that kind of thing might keep you reading. I know I'd fall asleep without it.)
    OK, so let's check out those 3 procedures and see what they reveal when I search for "Hekaton". First, sp_createstats:

        -- filter out local temp tables, Hekaton tables, and tables for which current user has no permissions
        -- Note that OBJECTPROPERTY returns NULL on type="IT" tables, thus we only call it on type='U' tables

    OK, that's interesting, let's go looking down a little further:

        ((@table_type<>'U') or (0 = OBJECTPROPERTY(@table_id, 'TableIsInMemory'))) and -- Hekaton table

    Wellllll, that tells us a few new things:

      • There's such a thing as Hekaton tables (UPDATE: I'm not the only one to have found them!)
      • They are not standard user tables and probably not in memory (UPDATE: I misinterpreted this because I didn't read all the code when I wrote this blog post.)
      • The OBJECTPROPERTY function has an undocumented TableIsInMemory option

    Let's check out sp_recompile:

        -- (3) Must not be a Hekaton procedure.

    And once again go a little further:

        if (ObjectProperty(@objid, 'IsExecuted') <> 0 AND
            ObjectProperty(@objid, 'IsInlineFunction') = 0 AND
            ObjectProperty(@objid, 'IsView') = 0 AND
            -- Hekaton procedure cannot be recompiled
            -- Make them go through schema version bumping branch, which will fail
            ObjectProperty(@objid, 'ExecIsCompiledProc') = 0)

    And now we learn that hekaton procedures also exist, they can't be recompiled, there's a "schema version bumping branch" somewhere, and OBJECTPROPERTY has another undocumented option, ExecIsCompiledProc. (If you experiment with this you'll find this option returns null; I think it only works when called from a system object.) This is neat!

    Sadly sp_updatestats doesn't reveal anything new; the comments about hekaton are the same as in sp_createstats. But we've ALSO discovered undocumented features for the OBJECTPROPERTY function, which we can now search for:

        SELECT name, object_definition(OBJECT_ID) FROM sys.all_objects WHERE object_definition(OBJECT_ID) LIKE '%OBJECTPROPERTY(%'

    I'll leave that to you as more homework.
    I should add that searching the system procedures was recommended long ago by the late, great Ken Henderson, in his Guru's Guide books, as a great way to find undocumented features. That seems to be really good advice!

    Now if you're a programmer/hacker, you've probably been drooling over the last 5 entries for hekaton, because these are the names of source code files for SQL Server! Does this mean we can access the source code for SQL Server? As The Oracle suggested to Neo, can we return to The Source???

    Actually, no. Well, maybe a little bit. While you won't get the actual source code from the compiled DLL and EXE files, you'll get references to source files, debugging symbols, variables and module names, error messages, and even the startup flags for SQL Server. And if you search for "DBCC" or "CHECKDB" you'll find a really nice section listing all the DBCC commands, including the undocumented ones. Granted those are pretty easy to find online, but you may be surprised what those web sites DIDN'T tell you! (And neither will I, go look for yourself!) And as we saw earlier, you'll also find execution plan elements, query processing rules, and who knows what else. It's also instructive to see how Microsoft organizes their source directories, and how various components (storage engine, query processor, Full Text, AlwaysOn/HADR) are split into smaller modules. There are over 2000 source file references, go do some exploring!

    So what did we learn? We can pull strings out of executable files, search them for known items, browse them for unknown items, and use the results to examine internal code to learn even more things about SQL Server. We've even learned how to use command-line utilities! We are now 1337 h4X0rz! (Not really. I hate that leetspeak crap.)

    Although, I must confess I might've gone too far with the "conspiracy" part of this post. I apologize for that, it's just my overactive imagination.
    There's really no hidden agenda or conspiracy regarding SQL Server internals. It's not The Matrix. It's not like you'd find anything like that in there:

        Attach Matrix Database
        DM_MATRIX_COMM_PIPELINES
        MATRIXXACTPARTICIPANTS
        dm_matrix_agents

    Alright, enough of this paranoid ranting! Microsoft are not really evil! It's not like they're The Borg from Star Trek:

        ALTER FEDERATION DROP
        ALTER FEDERATION SPLIT
        DROP FEDERATION

    #tsql2sday
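The extract-then-search workflow described in this post can be approximated in a few lines of Python, which may be handy when Strings.exe is not available. This is a rough sketch and not how Strings.exe itself is implemented: it only finds runs of printable ASCII, whereas the Sysinternals tool also scans for Unicode strings. The function names extract_strings and find_keyword, and the SAMPLE bytes, are hypothetical and stand in for a real DLL read with open(path, "rb").

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return every run of printable ASCII that is at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_keyword(strings, keyword):
    """Case-insensitive search over the extracted strings."""
    keyword = keyword.lower()
    return [s for s in strings if keyword in s.lower()]


# Made-up binary blob with a few names from the post embedded between
# non-printable bytes; a real run would read a file in binary mode instead.
SAMPLE = (
    b"\x00\x01MaxDOPSetToOne\x00\xff\x02NoParallelHekatonPlan\x00"
    b"ab\x00sp_deploy_hekaton_database\x00"
)

if __name__ == "__main__":
    found = extract_strings(SAMPLE)
    for s in find_keyword(found, "hekaton"):
        print(s)
```

As with Strings.exe, the same caution from the post applies: work on a copy of the files, not on a production or running instance.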

    Read the article
