Search Results

Search found 2429 results on 98 pages for 'monitoring'.

Page 98/98 | < Previous Page | 94 95 96 97 98 

  • Diving into OpenStack Network Architecture - Part 2 - Basic Use Cases

    - by Ronen Kofman
      rkofman Normal rkofman 4 138 2014-06-05T03:38:00Z 2014-06-05T05:04:00Z 3 2735 15596 Oracle Corporation 129 36 18295 12.00 Clean Clean false false false false EN-US X-NONE HE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:Arial; mso-bidi-theme-font:minor-bidi; mso-bidi-language:AR-SA;} In the previous post we reviewed several network components including Open vSwitch, Network Namespaces, Linux Bridges and veth pairs. In this post we will take three simple use cases and see how those basic components come together to create a complete SDN solution in OpenStack. With those three use cases we will review almost the entire network setup and see how all the pieces work together. The use cases we will use are: 1.       Create network – what happens when we create network and how can we create multiple isolated networks 2.       Launch a VM – once we have networks we can launch VMs and connect them to networks. 3.       DHCP request from a VM – OpenStack can automatically assign IP addresses to VMs. This is done through local DHCP service controlled by OpenStack Neutron. We will see how this service runs and how does a DHCP request and response look like. In this post we will show connectivity, we will see how packets get from point A to point B. We first focus on how a configured deployment looks like and only later we will discuss how and when the configuration is created. Personally I found it very valuable to see the actual interfaces and how they connect to each other through examples and hands on experiments. After the end game is clear and we know how the connectivity works, in a later post, we will take a step back and explain how Neutron configures the components to be able to provide such connectivity.  We are going to get pretty technical shortly and I recommend trying these examples on your own deployment or using the Oracle OpenStack Tech Preview. Understanding these three use cases thoroughly and how to look at them will be very helpful when trying to debug a deployment in case something does not work. Use case #1: Create Network Create network is a simple operation it can be performed from the GUI or command line. When we create a network in OpenStack the network is only available to the tenant who created it or it could be defined as “shared” and then it can be used by all tenants. A network can have multiple subnets but for this demonstration purpose and for simplicity we will assume that each network has exactly one subnet. Creating a network from the command line will look like this: # neutron net-create net1 Created a new network: +---------------------------+--------------------------------------+ | Field                     | Value                                | +---------------------------+--------------------------------------+ | admin_state_up            | True                                 | | id                        | 5f833617-6179-4797-b7c0-7d420d84040c | | name                      | net1                                 | | provider:network_type     | vlan                                 | | provider:physical_network | default                              | | provider:segmentation_id  | 1000                                 | | shared                    | False                                | | status                    | ACTIVE                               | | subnets                   |                                      | | tenant_id                 | 9796e5145ee546508939cd49ad59d51f     | +---------------------------+--------------------------------------+ Creating a subnet for this network will look like this: # neutron subnet-create net1 10.10.10.0/24 Created a new subnet: +------------------+------------------------------------------------+ | Field            | Value                                          | +------------------+------------------------------------------------+ | allocation_pools | {"start": "10.10.10.2", "end": "10.10.10.254"} | | cidr             | 10.10.10.0/24                                  | | dns_nameservers  |                                                | | enable_dhcp      | True                                           | | gateway_ip       | 10.10.10.1                                     | | host_routes      |                                                | | id               | 2d7a0a58-0674-439a-ad23-d6471aaae9bc           | | ip_version       | 4                                              | | name             |                                                | | network_id       | 5f833617-6179-4797-b7c0-7d420d84040c           | | tenant_id        | 9796e5145ee546508939cd49ad59d51f               | +------------------+------------------------------------------------+ We now have a network and a subnet, on the network topology view this looks like this: Now let’s dive in and see what happened under the hood. Looking at the control node we will discover that a new namespace was created: # ip netns list qdhcp-5f833617-6179-4797-b7c0-7d420d84040c   The name of the namespace is qdhcp-<network id> (see above), let’s look into the namespace and see what’s in it: # ip netns exec qdhcp-5f833617-6179-4797-b7c0-7d420d84040c ip addr 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00     inet 127.0.0.1/8 scope host lo     inet6 ::1/128 scope host        valid_lft forever preferred_lft forever 12: tap26c9b807-7c: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN     link/ether fa:16:3e:1d:5c:81 brd ff:ff:ff:ff:ff:ff     inet 10.10.10.3/24 brd 10.10.10.255 scope global tap26c9b807-7c     inet6 fe80::f816:3eff:fe1d:5c81/64 scope link        valid_lft forever preferred_lft forever   We see two interfaces in the namespace, one is the loopback and the other one is an interface called “tap26c9b807-7c”. This interface has the IP address of 10.10.10.3 and it will also serve dhcp requests in a way we will see later. Let’s trace the connectivity of the “tap26c9b807-7c” interface from the namespace.  First stop is OVS, we see that the interface connects to bridge  “br-int” on OVS: # ovs-vsctl show 8a069c7c-ea05-4375-93e2-b9fc9e4b3ca1     Bridge "br-eth2"         Port "br-eth2"             Interface "br-eth2"                 type: internal         Port "eth2"             Interface "eth2"         Port "phy-br-eth2"             Interface "phy-br-eth2"     Bridge br-ex         Port br-ex             Interface br-ex                 type: internal     Bridge br-int         Port "int-br-eth2"             Interface "int-br-eth2"         Port "tap26c9b807-7c"             tag: 1             Interface "tap26c9b807-7c"                 type: internal         Port br-int             Interface br-int                 type: internal     ovs_version: "1.11.0"   In the picture above we have a veth pair which has two ends called “int-br-eth2” and "phy-br-eth2", this veth pair is used to connect two bridge in OVS "br-eth2" and "br-int". In the previous post we explained how to check the veth connectivity using the ethtool command. It shows that the two are indeed a pair: # ethtool -S int-br-eth2 NIC statistics:      peer_ifindex: 10 . .   #ip link . . 10: phy-br-eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 . . Note that “phy-br-eth2” is connected to a bridge called "br-eth2" and one of this bridge's interfaces is the physical link eth2. This means that the network which we have just created has created a namespace which is connected to the physical interface eth2. eth2 is the “VM network” the physical interface where all the virtual machines connect to where all the VMs are connected. About network isolation: OpenStack supports creation of multiple isolated networks and can use several mechanisms to isolate the networks from one another. The isolation mechanism can be VLANs, VxLANs or GRE tunnels, this is configured as part of the initial setup in our deployment we use VLANs. When using VLAN tagging as an isolation mechanism a VLAN tag is allocated by Neutron from a pre-defined VLAN tags pool and assigned to the newly created network. By provisioning VLAN tags to the networks Neutron allows creation of multiple isolated networks on the same physical link.  The big difference between this and other platforms is that the user does not have to deal with allocating and managing VLANs to networks. The VLAN allocation and provisioning is handled by Neutron which keeps track of the VLAN tags, and responsible for allocating and reclaiming VLAN tags. In the example above net1 has the VLAN tag 1000, this means that whenever a VM is created and connected to this network the packets from that VM will have to be tagged with VLAN tag 1000 to go on this particular network. This is true for namespace as well, if we would like to connect a namespace to a particular network we have to make sure that the packets to and from the namespace are correctly tagged when they reach the VM network. In the example above we see that the namespace interface “tap26c9b807-7c” has vlan tag 1 assigned to it, if we examine OVS we see that it has flows which modify VLAN tag 1 to VLAN tag 1000 when a packet goes to the VM network on eth2 and vice versa. We can see this using the dump-flows command on OVS for packets going to the VM network we see the modification done on br-eth2: #  ovs-ofctl dump-flows br-eth2 NXST_FLOW reply (xid=0x4):  cookie=0x0, duration=18669.401s, table=0, n_packets=857, n_bytes=163350, idle_age=25, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:1000,NORMAL  cookie=0x0, duration=165108.226s, table=0, n_packets=14, n_bytes=1000, idle_age=5343, hard_age=65534, priority=2,in_port=2 actions=drop  cookie=0x0, duration=165109.813s, table=0, n_packets=1671, n_bytes=213304, idle_age=25, hard_age=65534, priority=1 actions=NORMAL   For packets coming from the interface to the namespace we see the following modification: #  ovs-ofctl dump-flows br-int NXST_FLOW reply (xid=0x4):  cookie=0x0, duration=18690.876s, table=0, n_packets=1610, n_bytes=210752, idle_age=1, priority=3,in_port=1,dl_vlan=1000 actions=mod_vlan_vid:1,NORMAL  cookie=0x0, duration=165130.01s, table=0, n_packets=75, n_bytes=3686, idle_age=4212, hard_age=65534, priority=2,in_port=1 actions=drop  cookie=0x0, duration=165131.96s, table=0, n_packets=863, n_bytes=160727, idle_age=1, hard_age=65534, priority=1 actions=NORMAL   To summarize we can see that when a user creates a network Neutron creates a namespace and this namespace is connected through OVS to the “VM network”. OVS also takes care of tagging the packets from the namespace to the VM network with the correct VLAN tag and knows to modify the VLAN for packets coming from VM network to the namespace. Now let’s see what happens when a VM is launched and how it is connected to the “VM network”. Use case #2: Launch a VM Launching a VM can be done from Horizon or from the command line this is how we do it from Horizon: Attach the network: And Launch Once the virtual machine is up and running we can see the associated IP using the nova list command : # nova list +--------------------------------------+--------------+--------+------------+-------------+-----------------+ | ID                                   | Name         | Status | Task State | Power State | Networks        | +--------------------------------------+--------------+--------+------------+-------------+-----------------+ | 3707ac87-4f5d-4349-b7ed-3a673f55e5e1 | Oracle Linux | ACTIVE | None       | Running     | net1=10.10.10.2 | +--------------------------------------+--------------+--------+------------+-------------+-----------------+ The nova list command shows us that the VM is running and that the IP 10.10.10.2 is assigned to this VM. Let’s trace the connectivity from the VM to VM network on eth2 starting with the VM definition file. The configuration files of the VM including the virtual disk(s), in case of ephemeral storage, are stored on the compute node at/var/lib/nova/instances/<instance-id>/. Looking into the VM definition file ,libvirt.xml,  we see that the VM is connected to an interface called “tap53903a95-82” which is connected to a Linux bridge called “qbr53903a95-82”: <interface type="bridge">       <mac address="fa:16:3e:fe:c7:87"/>       <source bridge="qbr53903a95-82"/>       <target dev="tap53903a95-82"/>     </interface>   Looking at the bridge using the brctl show command we see this: # brctl show bridge name     bridge id               STP enabled     interfaces qbr53903a95-82          8000.7e7f3282b836       no              qvb53903a95-82                                                         tap53903a95-82    The bridge has two interfaces, one connected to the VM (“tap53903a95-82 “) and another one ( “qvb53903a95-82”) connected to “br-int” bridge on OVS: # ovs-vsctl show 83c42f80-77e9-46c8-8560-7697d76de51c     Bridge "br-eth2"         Port "br-eth2"             Interface "br-eth2"                 type: internal         Port "eth2"             Interface "eth2"         Port "phy-br-eth2"             Interface "phy-br-eth2"     Bridge br-int         Port br-int             Interface br-int                 type: internal         Port "int-br-eth2"             Interface "int-br-eth2"         Port "qvo53903a95-82"             tag: 3             Interface "qvo53903a95-82"     ovs_version: "1.11.0"   As we showed earlier “br-int” is connected to “br-eth2” on OVS using the veth pair int-br-eth2,phy-br-eth2 and br-eth2 is connected to the physical interface eth2. The whole flow end to end looks like this: VM è tap53903a95-82 (virtual interface)è qbr53903a95-82 (Linux bridge) è qvb53903a95-82 (interface connected from Linux bridge to OVS bridge br-int) è int-br-eth2 (veth one end) è phy-br-eth2 (veth the other end) è eth2 physical interface. The purpose of the Linux Bridge connecting to the VM is to allow security group enforcement with iptables. Security groups are enforced at the edge point which are the interface of the VM, since iptables nnot be applied to OVS bridges we use Linux bridge to apply them. In the future we hope to see this Linux Bridge going away rules.  VLAN tags: As we discussed in the first use case net1 is using VLAN tag 1000, looking at OVS above we see that qvo41f1ebcf-7c is tagged with VLAN tag 3. The modification from VLAN tag 3 to 1000 as we go to the physical network is done by OVS  as part of the packet flow of br-eth2 in the same way we showed before. To summarize, when a VM is launched it is connected to the VM network through a chain of elements as described here. During the packet from VM to the network and back the VLAN tag is modified. Use case #3: Serving a DHCP request coming from the virtual machine In the previous use cases we have shown that both the namespace called dhcp-<some id> and the VM end up connecting to the physical interface eth2  on their respective nodes, both will tag their packets with VLAN tag 1000.We saw that the namespace has an interface with IP of 10.10.10.3. Since the VM and the namespace are connected to each other and have interfaces on the same subnet they can ping each other, in this picture we see a ping from the VM which was assigned 10.10.10.2 to the namespace: The fact that they are connected and can ping each other can become very handy when something doesn’t work right and we need to isolate the problem. In such case knowing that we should be able to ping from the VM to the namespace and back can be used to trace the disconnect using tcpdump or other monitoring tools. To serve DHCP requests coming from VMs on the network Neutron uses a Linux tool called “dnsmasq”,this is a lightweight DNS and DHCP service you can read more about it here. If we look at the dnsmasq on the control node with the ps command we see this: dnsmasq --no-hosts --no-resolv --strict-order --bind-interfaces --interface=tap26c9b807-7c --except-interface=lo --pid-file=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/host --dhcp-optsfile=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/opts --leasefile-ro --dhcp-range=tag0,10.10.10.0,static,120s --dhcp-lease-max=256 --conf-file= --domain=openstacklocal The service connects to the tap interface in the namespace (“--interface=tap26c9b807-7c”), If we look at the hosts file we see this: # cat  /var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/host fa:16:3e:fe:c7:87,host-10-10-10-2.openstacklocal,10.10.10.2   If you look at the console output above you can see the MAC address fa:16:3e:fe:c7:87 which is the VM MAC. This MAC address is mapped to IP 10.10.10.2 and so when a DHCP request comes with this MAC dnsmasq will return the 10.10.10.2.If we look into the namespace at the time we initiate a DHCP request from the VM (this can be done by simply restarting the network service in the VM) we see the following: # ip netns exec qdhcp-5f833617-6179-4797-b7c0-7d420d84040c tcpdump -n 19:27:12.191280 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:fe:c7:87, length 310 19:27:12.191666 IP 10.10.10.3.bootps > 10.10.10.2.bootpc: BOOTP/DHCP, Reply, length 325   To summarize, the DHCP service is handled by dnsmasq which is configured by Neutron to listen to the interface in the DHCP namespace. Neutron also configures dnsmasq with the combination of MAC and IP so when a DHCP request comes along it will receive the assigned IP. Summary In this post we relied on the components described in the previous post and saw how network connectivity is achieved using three simple use cases. These use cases gave a good view of the entire network stack and helped understand how an end to end connection is being made between a VM on a compute node and the DHCP namespace on the control node. One conclusion we can draw from what we saw here is that if we launch a VM and it is able to perform a DHCP request and receive a correct IP then there is reason to believe that the network is working as expected. We saw that a packet has to travel through a long list of components before reaching its destination and if it has done so successfully this means that many components are functioning properly. In the next post we will look at some more sophisticated services Neutron supports and see how they work. We will see that while there are some more components involved for the most part the concepts are the same. @RonenKofman

    Read the article

  • FreeBSD performance tuning. Sysctls, loader.conf, kernel

    - by SaveTheRbtz
    I wanted to share knowledge of tuning FreeBSD via sysctl.conf/loader.conf/KENCONF. It was initially based on Igor Sysoev's (author of nginx) presentation about FreeBSD tuning up to 100,000-200,000 active connections. Tunings are for FreeBSD-CURRENT. Since 7.2 amd64 some of them are tuned well by default. Prior 7.0 some of them are boot only (set via /boot/loader.conf) or does not exist at all. sysctl.conf: # No zero mapping feature # May break wine # (There are also reports about broken samba3) #security.bsd.map_at_zero=0 # If you have really busy webserver with apache13 you may run out of processes #kern.maxproc=10000 # Same for servers with apache2 / Pound #kern.threads.max_threads_per_proc=4096 # Max. backlog size kern.ipc.somaxconn=4096 # Shared memory // 7.2+ can use shared memory > 2Gb kern.ipc.shmmax=2147483648 # Sockets kern.ipc.maxsockets=204800 # Can cause this on older kernels: # http://old.nabble.com/Significant-performance-regression-for-increased-maxsockbuf-on-8.0-RELEASE-tt26745981.html#a26745981 ) kern.ipc.maxsockbuf=10485760 # Mbuf 2k clusters (on amd64 7.2+ 25600 is default) # For such high value vm.kmem_size must be increased to 3G kern.ipc.nmbclusters=262144 # Jumbo pagesize(_SC_PAGESIZE) clusters # Used as general packet storage for jumbo frames # can be monitored via `netstat -m` #kern.ipc.nmbjumbop=262144 # Jumbo 9k/16k clusters # If you are using them #kern.ipc.nmbjumbo9=65536 #kern.ipc.nmbjumbo16=32768 # For lower latency you can decrease scheduler's maximum time slice # default: stathz/10 (~ 13) #kern.sched.slice=1 # Increase max command-line length showed in `ps` (e.g for Tomcat/Java) # Default is PAGE_SIZE / 16 or 256 on x86 # This avoids commands to be presented as [executable] in `ps` # For more info see: http://www.freebsd.org/cgi/query-pr.cgi?pr=120749 kern.ps_arg_cache_limit=4096 # Every socket is a file, so increase them kern.maxfiles=204800 kern.maxfilesperproc=200000 kern.maxvnodes=200000 # On some systems HPET is almost 2 times faster than default ACPI-fast # Useful on systems with lots of clock_gettime / gettimeofday calls # See http://old.nabble.com/ACPI-fast-default-timecounter,-but-HPET-83--faster-td23248172.html # After revision 222222 HPET became default: http://svnweb.freebsd.org/base?view=revision&revision=222222 kern.timecounter.hardware=HPET # Small receive space, only usable on http-server, on file server this # should be increased to 65535 or even more #net.inet.tcp.recvspace=8192 # This is useful on Fat-Long-Pipes #net.inet.tcp.recvbuf_max=10485760 #net.inet.tcp.recvbuf_inc=65535 # Small send space is useful for http servers that serve small files # Autotuned since 7.x net.inet.tcp.sendspace=16384 # This is useful on Fat-Long-Pipes #net.inet.tcp.sendbuf_max=10485760 #net.inet.tcp.sendbuf_inc=65535 # Turn off receive autotuning # You can play with it. #net.inet.tcp.recvbuf_auto=0 #net.inet.tcp.sendbuf_auto=0 # This should be enabled if you going to use big spaces (>64k) # Also timestamp field is useful when using syncookies net.inet.tcp.rfc1323=1 # Turn this off on high-speed, lossless connections (LAN 1Gbit+) # If you set it there is no need in TCP_NODELAY sockopt (see man tcp) net.inet.tcp.delayed_ack=0 # This feature is useful if you are serving data over modems, Gigabit Ethernet, # or even high speed WAN links (or any other link with a high bandwidth delay product), # especially if you are also using window scaling or have configured a large send window. # Automatically disables on small RTT ( http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_subr.c?#rev1.237 ) # This sysctl was removed in 10-CURRENT: # See: http://www.mail-archive.com/[email protected]/msg06178.html #net.inet.tcp.inflight.enable=0 # TCP slowstart algorithm tunings # We assuming we have very fast clients #net.inet.tcp.slowstart_flightsize=100 #net.inet.tcp.local_slowstart_flightsize=100 # Disable randomizing of ports to avoid false RST # Before usage check SA here www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf # (it's also says that port randomization auto-disables at some conn.rates, but I didn't checked it thou) #net.inet.ip.portrange.randomized=0 # Increase portrange # For outgoing connections only. Good for seed-boxes and ftp servers. net.inet.ip.portrange.first=1024 net.inet.ip.portrange.last=65535 # # stops route cache degregation during a high-bandwidth flood # http://www.freebsd.org/doc/en/books/handbook/securing-freebsd.html #net.inet.ip.rtexpire=2 net.inet.ip.rtminexpire=2 net.inet.ip.rtmaxcache=1024 # Security net.inet.ip.redirect=0 net.inet.ip.sourceroute=0 net.inet.ip.accept_sourceroute=0 net.inet.icmp.maskrepl=0 net.inet.icmp.log_redirect=0 net.inet.icmp.drop_redirect=1 net.inet.tcp.drop_synfin=1 # # There is also good example of sysctl.conf with comments: # http://www.thern.org/projects/sysctl.conf # # icmp may NOT rst, helpful for those pesky spoofed # icmp/udp floods that end up taking up your outgoing # bandwidth/ifqueue due to all that outgoing RST traffic. # #net.inet.tcp.icmp_may_rst=0 # Security net.inet.udp.blackhole=1 net.inet.tcp.blackhole=2 # IPv6 Security # For more info see http://www.fosslc.org/drupal/content/security-implications-ipv6 # Disable Node info replies # To see this vulnerability in action run `ping6 -a sglAac ::1` or `ping6 -w ::1` on unprotected node net.inet6.icmp6.nodeinfo=0 # Turn on IPv6 privacy extensions # For more info see proposal http://unix.derkeiler.com/Mailing-Lists/FreeBSD/net/2008-06/msg00103.html net.inet6.ip6.use_tempaddr=1 net.inet6.ip6.prefer_tempaddr=1 # Disable ICMP redirect net.inet6.icmp6.rediraccept=0 # Disable acceptation of RA and auto linklocal generation if you don't use them #net.inet6.ip6.accept_rtadv=0 #net.inet6.ip6.auto_linklocal=0 # Increases default TTL, sometimes useful # Default is 64 net.inet.ip.ttl=128 # Lessen max segment life to conserve resources # ACK waiting time in miliseconds # (default: 30000. RFC from 1979 recommends 120000) net.inet.tcp.msl=5000 # Max bumber of timewait sockets net.inet.tcp.maxtcptw=200000 # Don't use tw on local connections # As of 15 Apr 2009. Igor Sysoev says that nolocaltimewait has some buggy realization. # So disable it or now till get fixed #net.inet.tcp.nolocaltimewait=1 # FIN_WAIT_2 state fast recycle net.inet.tcp.fast_finwait2_recycle=1 # Time before tcp keepalive probe is sent # default is 2 hours (7200000) #net.inet.tcp.keepidle=60000 # Should be increased until net.inet.ip.intr_queue_drops is zero net.inet.ip.intr_queue_maxlen=4096 # Interrupt handling via multiple CPU, but with context switch. # You can play with it. Default is 1; #net.isr.direct=0 # This is for routers only #net.inet.ip.forwarding=1 #net.inet.ip.fastforwarding=1 # This speed ups dummynet when channel isn't saturated net.inet.ip.dummynet.io_fast=1 # Increase dummynet(4) hash #net.inet.ip.dummynet.hash_size=2048 #net.inet.ip.dummynet.max_chain_len # Should be increased when you have A LOT of files on server # (Increase until vfs.ufs.dirhash_mem becomes lower) vfs.ufs.dirhash_maxmem=67108864 # Note from commit http://svn.freebsd.org/base/head@211031 : # For systems with RAID volumes and/or virtualization envirnments, where # read performance is very important, increasing this sysctl tunable to 32 # or even more will demonstratively yield additional performance benefits. vfs.read_max=32 # Explicit Congestion Notification (see http://en.wikipedia.org/wiki/Explicit_Congestion_Notification) net.inet.tcp.ecn.enable=1 # Flowtable - flow caching mechanism # Useful for routers #net.inet.flowtable.enable=1 #net.inet.flowtable.nmbflows=65535 # Extreme polling tuning #kern.polling.burst_max=1000 #kern.polling.each_burst=1000 #kern.polling.reg_frac=100 #kern.polling.user_frac=1 #kern.polling.idle_poll=0 # IPFW dynamic rules and timeouts tuning # Increase dyn_buckets till net.inet.ip.fw.curr_dyn_buckets is lower net.inet.ip.fw.dyn_buckets=65536 net.inet.ip.fw.dyn_max=65536 net.inet.ip.fw.dyn_ack_lifetime=120 net.inet.ip.fw.dyn_syn_lifetime=10 net.inet.ip.fw.dyn_fin_lifetime=2 net.inet.ip.fw.dyn_short_lifetime=10 # Make packets pass firewall only once when using dummynet # i.e. packets going thru pipe are passing out from firewall with accept #net.inet.ip.fw.one_pass=1 # shm_use_phys Wires all shared pages, making them unswappable # Use this to lessen Virtual Memory Manager's work when using Shared Mem. # Useful for databases #kern.ipc.shm_use_phys=1 # ZFS # Enable prefetch. Useful for sequential load type i.e fileserver. # FreeBSD sets vfs.zfs.prefetch_disable to 1 on any i386 systems and # on any amd64 systems with less than 4GB of avaiable memory # For additional info check this nabble thread http://old.nabble.com/Samba-read-speed-performance-tuning-td27964534.html #vfs.zfs.prefetch_disable=0 # On highload servers you may notice following message in dmesg: # "Approaching the limit on PV entries, consider increasing either the # vm.pmap.shpgperproc or the vm.pmap.pv_entry_max tunable" vm.pmap.shpgperproc=2048 loader.conf: # Accept filters for data, http and DNS requests # Useful when your software uses select() instead of kevent/kqueue or when you under DDoS # DNS accf available on 8.0+ accf_data_load="YES" accf_http_load="YES" accf_dns_load="YES" # Async IO system calls aio_load="YES" # Linux specific devices in /dev # As for 8.1 it only /dev/full #lindev_load="YES" # Adds NCQ support in FreeBSD # WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+ # 8.0+ only #ahci_load="YES" #siis_load="YES" # FreeBSD 8.2+ # New Congestion Control for FreeBSD # http://caia.swin.edu.au/urp/newtcp/tools/cc_chd-readme-0.1.txt # http://www.ietf.org/proceedings/78/slides/iccrg-5.pdf # Initial merge commit message http://www.mail-archive.com/[email protected]/msg31410.html #cc_chd_load="YES" # Increase kernel memory size to 3G. # # Use ONLY if you have KVA_PAGES in kernel configuration, and you have more than 3G RAM # Otherwise panic will happen on next reboot! # # It's required for high buffer sizes: kern.ipc.nmbjumbop, kern.ipc.nmbclusters, etc # Useful on highload stateful firewalls, proxies or ZFS fileservers # (FreeBSD 7.2+ amd64 users: Check that current value is lower!) #vm.kmem_size="3G" # If your server has lots of swap (>4Gb) you should increase following value # according to http://lists.freebsd.org/pipermail/freebsd-hackers/2009-October/029616.html # Otherwise you'll be getting errors # "kernel: swap zone exhausted, increase kern.maxswzone" # kern.maxswzone="256M" # Older versions of FreeBSD can't tune maxfiles on the fly #kern.maxfiles="200000" # Useful for databases # Sets maximum data size to 1G # (FreeBSD 7.2+ amd64 users: Check that current value is lower!) #kern.maxdsiz="1G" # Maximum buffer size(vfs.maxbufspace) # You can check current one via vfs.bufspace # Should be lowered/upped depending on server's load-type # Usually decreased to preserve kmem # (default is 10% of mem) #kern.maxbcache="512M" # Sendfile buffers # For i386 only #kern.ipc.nsfbufs=10240 # FreeBSD 9+ # HPET "legacy route" support. It should allow HPET to work per-CPU # See http://www.mail-archive.com/[email protected]/msg03603.html #hint.atrtc.0.clock=0 #hint.attimer.0.clock=0 #hint.hpet.0.legacy_route=1 # syncache Hash table tuning net.inet.tcp.syncache.hashsize=1024 net.inet.tcp.syncache.bucketlimit=512 net.inet.tcp.syncache.cachelimit=65536 # Increased hostcache # Later host cache can be viewed via net.inet.tcp.hostcache.list hidden sysctl # Very useful for it's RTT RTTVAR # Must be power of two net.inet.tcp.hostcache.hashsize=65536 # hashsize * bucketlimit (which is 30 by default) # It allocates 255Mb (1966080*136) of RAM net.inet.tcp.hostcache.cachelimit=1966080 # TCP control-block Hash table tuning net.inet.tcp.tcbhashsize=4096 # Disable ipfw deny all # Should be uncommented when there is a chance that # kernel and ipfw binary may be out-of sync on next reboot #net.inet.ip.fw.default_to_accept=1 # # SIFTR (Statistical Information For TCP Research) is a kernel module that # logs a range of statistics on active TCP connections to a log file. # See prerelease notes http://groups.google.com/group/mailing.freebsd.current/browse_thread/thread/b4c18be6cdce76e4 # and man 4 sitfr #siftr_load="YES" # Enable superpages, for 7.2+ only # Also read http://lists.freebsd.org/pipermail/freebsd-hackers/2009-November/030094.html vm.pmap.pg_ps_enabled=1 # Usefull if you are using Intel-Gigabit NIC #hw.em.rxd=4096 #hw.em.txd=4096 #hw.em.rx_process_limit="-1" # Also if you have ALOT interrupts on NIC - play with following parameters # NOTE: You should set them for every NIC #dev.em.0.rx_int_delay: 250 #dev.em.0.tx_int_delay: 250 #dev.em.0.rx_abs_int_delay: 250 #dev.em.0.tx_abs_int_delay: 250 # There is also multithreaded version of em/igb drivers can be found here: # http://people.yandex-team.ru/~wawa/ # # for additional em monitoring and statistics use # sysctl dev.em.0.stats=1 ; dmesg # sysctl dev.em.0.debug=1 ; dmesg # Also after r209242 (-CURRENT) there is a separate sysctl for each stat variable; # Same tunings for igb #hw.igb.rxd=4096 #hw.igb.txd=4096 #hw.igb.rx_process_limit=100 # Some useful netisr tunables. See sysctl net.isr #net.isr.maxthreads=4 #net.isr.defaultqlimit=4096 #net.isr.maxqlimit: 10240 # Bind netisr threads to CPUs #net.isr.bindthreads=1 # # FreeBSD 9.x+ # Increase interface send queue length # See commit message http://svn.freebsd.org/viewvc/base?view=revision&revision=207554 #net.link.ifqmaxlen=1024 # Nicer boot logo =) loader_logo="beastie" And finally here is KERNCONF: # Just some of them, see also # cat /sys/{i386,amd64,}/conf/NOTES # This one useful only on i386 #options KVA_PAGES=512 # You can play with HZ in environments with high interrupt rate (default is 1000) # 100 is for my notebook to prolong it's battery life #options HZ=100 # Polling is goot on network loads with high packet rates and low-end NICs # NB! Do not enable it if you want more than one netisr thread #options DEVICE_POLLING # Eliminate datacopy on socket read-write # To take advantage with zero copy sockets you should have an MTU >= 4k # This req. is only for receiving data. # Read more in man zero_copy_sockets # Also this epic thread on kernel trap: # http://kerneltrap.org/node/6506 # Here Linus says that "anybody that does it that way (FreeBSD) is totally incompetent" #options ZERO_COPY_SOCKETS # Support TCP sign. Used for IPSec options TCP_SIGNATURE # There was stackoverflow found in KAME IPSec stack: # See http://secunia.com/advisories/43995/ # For quick workaround you can use `ipfw add deny proto ipcomp` options IPSEC # This ones can be loaded as modules. They described in loader.conf section #options ACCEPT_FILTER_DATA #options ACCEPT_FILTER_HTTP # Adding ipfw, also can be loaded as modules options IPFIREWALL # On 8.1+ you can disable verbose to see blocked packets on ipfw0 interface. # Also there is no point in compiling verbose into the kernel, because # now there is net.inet.ip.fw.verbose tunable. #options IPFIREWALL_VERBOSE #options IPFIREWALL_VERBOSE_LIMIT=10 options IPFIREWALL_FORWARD # Adding kernel NAT options IPFIREWALL_NAT options LIBALIAS # Traffic shaping options DUMMYNET # Divert, i.e. for userspace NAT options IPDIVERT # This is for OpenBSD's pf firewall device pf device pflog # pf's QoS - ALTQ options ALTQ options ALTQ_CBQ # Class Bases Queuing (CBQ) options ALTQ_RED # Random Early Detection (RED) options ALTQ_RIO # RED In/Out options ALTQ_HFSC # Hierarchical Packet Scheduler (HFSC) options ALTQ_PRIQ # Priority Queuing (PRIQ) options ALTQ_NOPCC # Required for SMP build # Pretty console # Manual can be found here http://forums.freebsd.org/showthread.php?t=6134 #options VESA #options SC_PIXEL_MODE # Disable reboot on Ctrl Alt Del #options SC_DISABLE_REBOOT # Change normal|kernel messages color options SC_NORM_ATTR=(FG_GREEN|BG_BLACK) options SC_KERNEL_CONS_ATTR=(FG_YELLOW|BG_BLACK) # More scroll space options SC_HISTORY_SIZE=8192 # Adding hardware crypto device device crypto device cryptodev # Useful network interfaces device vlan device tap #Virtual Ethernet driver device gre #IP over IP tunneling device if_bridge #Bridge interface device pfsync #synchronization interface for PF device carp #Common Address Redundancy Protocol device enc #IPsec interface device lagg #Link aggregation interface device stf #IPv4-IPv6 port # Also for my notebook, but may be used with Opteron device amdtemp # Same for Intel processors device coretemp # man 4 cpuctl device cpuctl # CPU control pseudo-device # Support for ECMP. More than one route for destination # Works even with default route so one can use it as LB for two ISP # For now code is unstable and panics (panic: rtfree 2) on route deletions. #options RADIX_MPATH # Multicast routing #options MROUTING #options PIM # Debug & DTrace options KDB # Kernel debugger related code options KDB_TRACE # Print a stack trace for a panic options KDTRACE_FRAME # amd64-only(?) options KDTRACE_HOOKS # all architectures - enable general DTrace hooks #options DDB #options DDB_CTF # all architectures - kernel ELF linker loads CTF data # Adaptive spining in lockmgr (8.x+) # See http://www.mail-archive.com/[email protected]/msg10782.html options ADAPTIVE_LOCKMGRS # UTF-8 in console (8.x+) #options TEKEN_UTF8 # FreeBSD 8.1+ # Deadlock resolver thread # For additional information see http://www.mail-archive.com/[email protected]/msg18124.html # (FYI: "resolution" is panic so use with caution) #options DEADLKRES # Increase maximum size of Raw I/O and sendfile(2) readahead #options MAXPHYS=(1024*1024) #options MAXBSIZE=(1024*1024) # For scheduler debug enable following option. # Debug will be available via `kern.sched.stats` sysctl # For more information see http://svnweb.freebsd.org/base/head/sys/conf/NOTES?view=markup #options SCHED_STATS If you are tuning network for maximum performance you may wish to play with ifconfig options like: # You can list all capabilities via `ifconfig -m` ifconfig [-]rxcsum [-]txcsum [-]tso [-]lro mtu In case you've enabled DDB in kernel config, you should edit your /etc/ddb.conf and add something like this to enable automatic reboot (and textdump as bonus): script kdb.enter.panic=textdump set; capture on; show pcpu; bt; ps; alltrace; capture off; call doadump; reset script kdb.enter.default=textdump set; capture on; bt; ps; capture off; call doadump; reset And do not forget to add ddb_enable="YES" to /etc/rc.conf Since FreeBSD 9 you can select to enable/disable flowcontrol on your NIC: # See http://en.wikipedia.org/wiki/Ethernet_flow_control and # http://www.mail-archive.com/[email protected]/msg07927.html for additional info ifconfig bge0 media auto mediaopt flowcontrol PS. Also most of FreeBSD's limits can be monitored by # vmstat -z and # limits PPS. variety of network counters can be monitored via # netstat -s In FreeBSD-9 netstat's -Q option appeared, try following command to display netisr stats # netstat -Q PPPS. also see # man 7 tuning PPPPS. I wanted to thank FreeBSD community, especially author of nginx - Igor Sysoev, nginx-ru@ and FreeBSD-performance@ mailing lists for providing useful information about FreeBSD tuning. FreeBSD WIP * Whats cooking for FreeBSD 7? * Whats cooking for FreeBSD 8? * Whats cooking for FreeBSD 9? So here is the question: What tunings are you using on yours FreeBSD servers? You can also post your /etc/sysctl.conf, /boot/loader.conf, kernel options, etc with description of its' meaning (do not copy-paste from sysctl -d). Don't forget to specify server type (web, smb, gateway, etc) Let's share experience!

    Read the article

  • Slow NFS and GFS2 performance

    - by Tiago
    Recently I've designed and configured a 4 node cluster for a webapp that does lots of file handling. The cluster have been broken down into 2 main roles, webserver and storage. Each role is replicated to a second server using drbd in active/passive mode. The webserver does a NFS mount of the data directory of the storage server and the latter also has a webserver running to serve files to browser clients. In the storage servers I've created a GFS2 FS to hold the data which is wired to drbd. I've chose GFS2 mainly because the announced performance and also because the volume size which has to be pretty high. Since we entered production I've been facing two problems that I think are deeply connected. First of all, the NFS mount on the webservers keeps hanging for a minute or so and then resumes normal operations. By analyzing the logs I've found out that NFS stops answering for a while and outputs the following log lines: Oct 15 18:15:42 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:44 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:46 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:47 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:47 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:47 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:48 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:48 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:51 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:52 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:52 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:55 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:55 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:58 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK In this case, the hang lasted for 16 seconds but sometimes it takes 1 or 2 minutes to resume normal operations. My first guess was this was happening due to heavy load of the NFS mount and that by increasing RPCNFSDCOUNT to a higher value, this would become stable. I've increased it several times and apparently, after a while, the logs started appearing less times. The value is now on 32. After further investigating the issue, I've came across a different hang, despite the NFS messages still appear in the logs. Sometimes, the GFS2 FS simply hangs which causes both the NFS and the storage webserver to serve files. Both stay hang for a while and then they resume normal operations. This hangs leaves no trace on client side (also leaves no NFS ... not responding messages) and, on the storage side, the log system appears to be empty, even though the rsyslogd is running. The nodes connect themselves through a 10Gbps non-dedicated connection but I don't think this is an issue because the GFS2 hang is confirmed but connecting directly to the active storage server. I've been trying to solve this for a while now and I've tried different NFS configuration options, before I've found out the GFS2 FS is also hanging. The NFS mount is exported as such: /srv/data/ <ip_address>(rw,async,no_root_squash,no_all_squash,fsid=25) And the NFS client mounts with: mount -o "async,hard,intr,wsize=8192,rsize=8192" active.storage.vlan:/srv/data /srv/data After some tests, these were the configurations that yielded more performance to the cluster. I am desperate to find a solution for this as the cluster is already in production mode and I need to fix this so that this hangs won't happen in the future and I don't really know for sure what and how I should be benchmarking. What I can tell is that this is happening due to heavy loads as I have tested the cluster earlier and this problems weren't happening at all. Please tell me if you need me to provide configuration details of the cluster, and which do you want me to post. As last resort I can migrate the files to a different FS but I need some solid pointers on whether this will solve this problems as the volume size is extremely large at this point. The servers are being hosted by a third-party enterprise and I don't have physical access to them. Best regards. EDIT 1: The servers are physical servers and their specs are: Webservers: Intel Bi Xeon E5606 2x4 2.13GHz 24GB DDR3 Intel SSD 320 2 x 120GB Raid 1 Storage: Intel i5 3550 3.3GHz 16GB DDR3 12 x 2TB SATA Initially there was a VRack setup between the servers but we've upgraded one of the storage servers to have more RAM and it wasn't inside the VRack. They connect through a shared 10Gbps connection between them. Please note that it is the same connection that is used for public access. They use a single IP (using IP Failover) to connect between them and to allow for a graceful failover. NFS is therefore over a public connection and not under any private network (it was before the upgrade, were the problem still existed). The firewall was configured and tested thoroughly but I disabled it for a while to see if the problem still occurred, and it did. From my knowledge the hosting provider isn't blocking or limiting the connection between either the servers and the public domain (at least under a given bandwidth consumption threshold that hasn't been reached yet). Hope this helps figuring out the problem. EDIT 2: Relevant software versions: CentOS 2.6.32-279.9.1.el6.x86_64 nfs-utils-1.2.3-26.el6.x86_64 nfs-utils-lib-1.1.5-4.el6.x86_64 gfs2-utils-3.0.12.1-32.el6_3.1.x86_64 kmod-drbd84-8.4.2-1.el6_3.elrepo.x86_64 drbd84-utils-8.4.2-1.el6.elrepo.x86_64 DRBD configuration on storage servers: #/etc/drbd.d/storage.res resource storage { protocol C; on <server1 fqdn> { device /dev/drbd0; disk /dev/vg_storage/LV_replicated; address <server1 ip>:7788; meta-disk internal; } on <server2 fqdn> { device /dev/drbd0; disk /dev/vg_storage/LV_replicated; address <server2 ip>:7788; meta-disk internal; } } NFS Configuration in storage servers: #/etc/sysconfig/nfs RPCNFSDCOUNT=32 STATD_PORT=10002 STATD_OUTGOING_PORT=10003 MOUNTD_PORT=10004 RQUOTAD_PORT=10005 LOCKD_UDPPORT=30001 LOCKD_TCPPORT=30001 (can there be any conflict in using the same port for both LOCKD_UDPPORT and LOCKD_TCPPORT?) GFS2 configuration: # gfs2_tool gettune <mountpoint> incore_log_blocks = 1024 log_flush_secs = 60 quota_warn_period = 10 quota_quantum = 60 max_readahead = 262144 complain_secs = 10 statfs_slow = 0 quota_simul_sync = 64 statfs_quantum = 30 quota_scale = 1.0000 (1, 1) new_files_jdata = 0 Storage network environment: eth0 Link encap:Ethernet HWaddr <mac address> inet addr:<ip address> Bcast:<bcast address> Mask:<ip mask> inet6 addr: <ip address> Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:957025127 errors:0 dropped:0 overruns:0 frame:0 TX packets:1473338731 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:2630984979622 (2.3 TiB) TX bytes:1648430431523 (1.4 TiB) eth0:0 Link encap:Ethernet HWaddr <mac address> inet addr:<ip failover address> Bcast:<bcast address> Mask:<ip mask> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 The IP addresses are statically assigned with the given network configurations: DEVICE="eth0" BOOTPROTO="static" HWADDR=<mac address> ONBOOT="yes" TYPE="Ethernet" IPADDR=<ip address> NETMASK=<net mask> and DEVICE="eth0:0" BOOTPROTO="static" HWADDR=<mac address> IPADDR=<ip failover> NETMASK=<net mask> ONBOOT="yes" BROADCAST=<bcast address> Hosts file to allow for a graceful NFS failover in conjunction with NFS option fsid=25 set on both storage servers: #/etc/hosts <storage ip failover address> active.storage.vlan <webserver ip failover address> active.service.vlan As you can see, packet errors are down to 0. I've also ran ping for a long time without any packet loss. MTU size is the normal 1500. As there is no VLan by now, this is the MTU used to communicate between servers. The webservers' network environment is similar. One thing I forgot to mention is that the storage servers handle ~200GB of new files each day through the NFS connection, which is a key point for me to think this is some kind of heavy load problem with either NFS or GFS2. If you need further configuration details please tell me. EDIT 3: Earlier today we had a major filesystem crash on the storage server. I couldn't get the details of the crash right away because the server stop responding. After the reboot, I noticed the filesystem was extremely slow, and I was not being able to serve a single file through either NFS or httpd, perhaps due to cache warming or so. Nevertheless, I've been monitoring the server closely and the following error came up in dmesg. The source of the problem is clearly GFS, which is waiting for a lock and ends up starving after a while. INFO: task nfsd:3029 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. nfsd D 0000000000000000 0 3029 2 0x00000080 ffff8803814f79e0 0000000000000046 0000000000000000 ffffffff8109213f ffff880434c5e148 ffff880624508d88 ffff8803814f7960 ffffffffa037253f ffff8803815c1098 ffff8803814f7fd8 000000000000fb88 ffff8803815c1098 Call Trace: [<ffffffff8109213f>] ? wake_up_bit+0x2f/0x40 [<ffffffffa037253f>] ? gfs2_holder_wake+0x1f/0x30 [gfs2] [<ffffffff814ff42e>] __mutex_lock_slowpath+0x13e/0x180 [<ffffffff814ff2cb>] mutex_lock+0x2b/0x50 [<ffffffffa0379f21>] gfs2_log_reserve+0x51/0x190 [gfs2] [<ffffffffa0390da2>] gfs2_trans_begin+0x112/0x1d0 [gfs2] [<ffffffffa0369b05>] ? gfs2_dir_check+0x35/0xe0 [gfs2] [<ffffffffa0377943>] gfs2_createi+0x1a3/0xaa0 [gfs2] [<ffffffff8121aab1>] ? avc_has_perm+0x71/0x90 [<ffffffffa0383d1e>] gfs2_create+0x7e/0x1a0 [gfs2] [<ffffffffa037783f>] ? gfs2_createi+0x9f/0xaa0 [gfs2] [<ffffffff81188cf4>] vfs_create+0xb4/0xe0 [<ffffffffa04217d6>] nfsd_create_v3+0x366/0x4c0 [nfsd] [<ffffffffa0429703>] nfsd3_proc_create+0x123/0x1b0 [nfsd] [<ffffffffa041a43e>] nfsd_dispatch+0xfe/0x240 [nfsd] [<ffffffffa025a5d4>] svc_process_common+0x344/0x640 [sunrpc] [<ffffffff810602a0>] ? default_wake_function+0x0/0x20 [<ffffffffa025ac10>] svc_process+0x110/0x160 [sunrpc] [<ffffffffa041ab62>] nfsd+0xc2/0x160 [nfsd] [<ffffffffa041aaa0>] ? nfsd+0x0/0x160 [nfsd] [<ffffffff81091de6>] kthread+0x96/0xa0 [<ffffffff8100c14a>] child_rip+0xa/0x20 [<ffffffff81091d50>] ? kthread+0x0/0xa0 [<ffffffff8100c140>] ? child_rip+0x0/0x20

    Read the article

  • Using a mounted NTFS share with nginx

    - by Hoff
    I have set up a local testing VM with Ubuntu Server 12.04 LTS and the LEMP stack. It's kind of an unconventional setup because instead of having all my PHP scripts on the local machine, I've mounted an NTFS share as the document root because I do my development on Windows. I had everything working perfectly up until this morning, now I keep getting a dreaded 'File not found.' error. I am almost certain this must be somehow permission related, because if I copy my site over to /var/www, nginx and php-fpm have no problems serving my PHP scripts. What I can't figure out is why all of a sudden (after a reboot of the server), no PHP files will be served but instead just the 'File not found.' error. Static files work fine, so I think it's PHP that is causing the headache. Both nginx and php-fpm are configured to run as the user www-data: root@ubuntu-server:~# ps aux | grep 'nginx\|php-fpm' root 1095 0.0 0.0 5816 792 ? Ss 11:11 0:00 nginx: master process /opt/nginx/sbin/nginx -c /etc/nginx/nginx.conf www-data 1096 0.0 0.1 6016 1172 ? S 11:11 0:00 nginx: worker process www-data 1098 0.0 0.1 6016 1172 ? S 11:11 0:00 nginx: worker process root 1130 0.0 0.4 175560 4212 ? Ss 11:11 0:00 php-fpm: master process (/etc/php5/php-fpm.conf) www-data 1131 0.0 0.3 175560 3216 ? S 11:11 0:00 php-fpm: pool www www-data 1132 0.0 0.3 175560 3216 ? S 11:11 0:00 php-fpm: pool www www-data 1133 0.0 0.3 175560 3216 ? S 11:11 0:00 php-fpm: pool www root 1686 0.0 0.0 4368 816 pts/1 S+ 11:11 0:00 grep --color=auto nginx\|php-fpm I have mounted the NTFS share at /mnt/webfiles by editing /etc/fstab and adding the following line: //192.168.0.199/c$/Websites/ /mnt/webfiles cifs username=Jordan,password=mypasswordhere,gid=33,uid=33 0 0 Where gid 33 is the www-data group and uid 33 is the user www-data. If I list the contents of one of my sites you can in fact see that they belong to the user www-data: root@ubuntu-server:~# ls -l /mnt/webfiles/nTv5-2.0 total 8 drwxr-xr-x 0 www-data www-data 0 Jun 6 19:12 app drwxr-xr-x 0 www-data www-data 0 Aug 22 19:00 assets -rwxr-xr-x 0 www-data www-data 1150 Jan 4 2012 favicon.ico -rwxr-xr-x 0 www-data www-data 1412 Dec 28 2011 index.php drwxr-xr-x 0 www-data www-data 0 Jun 3 16:44 lib drwxr-xr-x 0 www-data www-data 0 Jan 3 2012 plugins drwxr-xr-x 0 www-data www-data 0 Jun 3 16:45 vendors If I switch to the www-data user, I have no problem creating a new file on the share: root@ubuntu-server:~# su www-data $ > /mnt/webfiles/test.txt $ ls -l /mnt/webfiles | grep test\.txt -rwxr-xr-x 0 www-data www-data 0 Sep 8 11:19 test.txt There should be no problem reading or writing to the share with php-fpm running as the user www-data. When I examine the error log of nginx, it's filled with a bunch of lines that look like the following: 2012/09/08 11:22:36 [error] 1096#0: *1 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 192.168.0.199, server: , request: "GET / HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.0.123" 2012/09/08 11:22:39 [error] 1096#0: *1 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 192.168.0.199, server: , request: "GET /apc.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.0.123" It's bizarre that this was working previously and now all of sudden PHP is complaining that it can't "find" the scripts on the share. Does anybody know why this is happening? EDIT I tried editing php-fpm.conf and changing chdir to the following: chdir = /mnt/webfiles When I try and restart the php-fpm service, I get the error: Starting php-fpm [08-Sep-2012 14:20:55] ERROR: [pool www] the chdir path '/mnt/webfiles' does not exist or is not a directory This is a total load of bullshit because this directory DOES exist and is mounted! Any ls commands to list that directory work perfectly. Why the hell can't PHP-FPM see this directory?! Here are my configuration files for reference: nginx.conf user www-data; worker_processes 2; error_log /var/log/nginx/nginx.log info; pid /var/run/nginx.pid; events { worker_connections 1024; multi_accept on; } http { include fastcgi.conf; include mime.types; default_type application/octet-stream; set_real_ip_from 127.0.0.1; real_ip_header X-Forwarded-For; ## Proxy proxy_redirect off; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; client_max_body_size 32m; client_body_buffer_size 128k; proxy_connect_timeout 90; proxy_send_timeout 90; proxy_read_timeout 90; proxy_buffers 32 4k; ## Compression gzip on; gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript; gzip_disable "MSIE [1-6]\.(?!.*SV1)"; ### TCP options tcp_nodelay on; tcp_nopush on; keepalive_timeout 65; sendfile on; include /etc/nginx/sites-enabled/*; } my site config server { listen 80; access_log /var/log/nginx/$host.access.log; error_log /var/log/nginx/error.log; root /mnt/webfiles/nTv5-2.0/app/webroot; index index.php; ## Block bad bots if ($http_user_agent ~* (HTTrack|HTMLParser|libcurl|discobot|Exabot|Casper|kmccrew|plaNETWORK|RPT-HTTPClient)) { return 444; } ## Block certain Referers (case insensitive) if ($http_referer ~* (sex|vigra|viagra) ) { return 444; } ## Deny dot files: location ~ /\. { deny all; } ## Favicon Not Found location = /favicon.ico { access_log off; log_not_found off; } ## Robots.txt Not Found location = /robots.txt { access_log off; log_not_found off; } if (-f $document_root/maintenance.html) { rewrite ^(.*)$ /maintenance.html last; } location ~* \.(?:ico|css|js|gif|jpe?g|png)$ { # Some basic cache-control for static files to be sent to the browser expires max; add_header Pragma public; add_header Cache-Control "max-age=2678400, public, must-revalidate"; } location / { try_files $uri $uri/ index.php; if (-f $request_filename) { break; } rewrite ^(.+)$ /index.php?url=$1 last; } location ~ \.php$ { include /etc/nginx/fastcgi.conf; fastcgi_pass unix:/var/run/php5-fpm.sock; } } php-fpm.conf ;;;;;;;;;;;;;;;;;;;;; ; FPM Configuration ; ;;;;;;;;;;;;;;;;;;;;; ; All relative paths in this configuration file are relative to PHP's install ; prefix (/opt/php5). This prefix can be dynamicaly changed by using the ; '-p' argument from the command line. ; Include one or more files. If glob(3) exists, it is used to include a bunch of ; files from a glob(3) pattern. This directive can be used everywhere in the ; file. ; Relative path can also be used. They will be prefixed by: ; - the global prefix if it's been set (-p arguement) ; - /opt/php5 otherwise ;include=etc/fpm.d/*.conf ;;;;;;;;;;;;;;;;;; ; Global Options ; ;;;;;;;;;;;;;;;;;; [global] ; Pid file ; Note: the default prefix is /opt/php5/var ; Default Value: none pid = /var/run/php-fpm.pid ; Error log file ; Note: the default prefix is /opt/php5/var ; Default Value: log/php-fpm.log error_log = /var/log/php5-fpm/php-fpm.log ; Log level ; Possible Values: alert, error, warning, notice, debug ; Default Value: notice ;log_level = notice ; If this number of child processes exit with SIGSEGV or SIGBUS within the time ; interval set by emergency_restart_interval then FPM will restart. A value ; of '0' means 'Off'. ; Default Value: 0 ;emergency_restart_threshold = 0 ; Interval of time used by emergency_restart_interval to determine when ; a graceful restart will be initiated. This can be useful to work around ; accidental corruptions in an accelerator's shared memory. ; Available Units: s(econds), m(inutes), h(ours), or d(ays) ; Default Unit: seconds ; Default Value: 0 ;emergency_restart_interval = 0 ; Time limit for child processes to wait for a reaction on signals from master. ; Available units: s(econds), m(inutes), h(ours), or d(ays) ; Default Unit: seconds ; Default Value: 0 ;process_control_timeout = 0 ; Send FPM to background. Set to 'no' to keep FPM in foreground for debugging. ; Default Value: yes ;daemonize = yes ;;;;;;;;;;;;;;;;;;;; ; Pool Definitions ; ;;;;;;;;;;;;;;;;;;;; ; Multiple pools of child processes may be started with different listening ; ports and different management options. The name of the pool will be ; used in logs and stats. There is no limitation on the number of pools which ; FPM can handle. Your system will tell you anyway :) ; Start a new pool named 'www'. ; the variable $pool can we used in any directive and will be replaced by the ; pool name ('www' here) [www] ; Per pool prefix ; It only applies on the following directives: ; - 'slowlog' ; - 'listen' (unixsocket) ; - 'chroot' ; - 'chdir' ; - 'php_values' ; - 'php_admin_values' ; When not set, the global prefix (or /opt/php5) applies instead. ; Note: This directive can also be relative to the global prefix. ; Default Value: none ;prefix = /path/to/pools/$pool ; The address on which to accept FastCGI requests. ; Valid syntaxes are: ; 'ip.add.re.ss:port' - to listen on a TCP socket to a specific address on ; a specific port; ; 'port' - to listen on a TCP socket to all addresses on a ; specific port; ; '/path/to/unix/socket' - to listen on a unix socket. ; Note: This value is mandatory. ;listen = 127.0.0.1:9000 listen = /var/run/php5-fpm.sock ; Set listen(2) backlog. A value of '-1' means unlimited. ; Default Value: 128 (-1 on FreeBSD and OpenBSD) ;listen.backlog = -1 ; List of ipv4 addresses of FastCGI clients which are allowed to connect. ; Equivalent to the FCGI_WEB_SERVER_ADDRS environment variable in the original ; PHP FCGI (5.2.2+). Makes sense only with a tcp listening socket. Each address ; must be separated by a comma. If this value is left blank, connections will be ; accepted from any ip address. ; Default Value: any ;listen.allowed_clients = 127.0.0.1 ; Set permissions for unix socket, if one is used. In Linux, read/write ; permissions must be set in order to allow connections from a web server. Many ; BSD-derived systems allow connections regardless of permissions. ; Default Values: user and group are set as the running user ; mode is set to 0666 ;listen.owner = www-data ;listen.group = www-data ;listen.mode = 0666 ; Unix user/group of processes ; Note: The user is mandatory. If the group is not set, the default user's group ; will be used. user = www-data group = www-data ; Choose how the process manager will control the number of child processes. ; Possible Values: ; static - a fixed number (pm.max_children) of child processes; ; dynamic - the number of child processes are set dynamically based on the ; following directives: ; pm.max_children - the maximum number of children that can ; be alive at the same time. ; pm.start_servers - the number of children created on startup. ; pm.min_spare_servers - the minimum number of children in 'idle' ; state (waiting to process). If the number ; of 'idle' processes is less than this ; number then some children will be created. ; pm.max_spare_servers - the maximum number of children in 'idle' ; state (waiting to process). If the number ; of 'idle' processes is greater than this ; number then some children will be killed. ; Note: This value is mandatory. pm = dynamic ; The number of child processes to be created when pm is set to 'static' and the ; maximum number of child processes to be created when pm is set to 'dynamic'. ; This value sets the limit on the number of simultaneous requests that will be ; served. Equivalent to the ApacheMaxClients directive with mpm_prefork. ; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP ; CGI. ; Note: Used when pm is set to either 'static' or 'dynamic' ; Note: This value is mandatory. pm.max_children = 50 ; The number of child processes created on startup. ; Note: Used only when pm is set to 'dynamic' ; Default Value: min_spare_servers + (max_spare_servers - min_spare_servers) / 2 pm.start_servers = 20 ; The desired minimum number of idle server processes. ; Note: Used only when pm is set to 'dynamic' ; Note: Mandatory when pm is set to 'dynamic' pm.min_spare_servers = 5 ; The desired maximum number of idle server processes. ; Note: Used only when pm is set to 'dynamic' ; Note: Mandatory when pm is set to 'dynamic' pm.max_spare_servers = 35 ; The number of requests each child process should execute before respawning. ; This can be useful to work around memory leaks in 3rd party libraries. For ; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS. ; Default Value: 0 pm.max_requests = 500 ; The URI to view the FPM status page. If this value is not set, no URI will be ; recognized as a status page. By default, the status page shows the following ; information: ; accepted conn - the number of request accepted by the pool; ; pool - the name of the pool; ; process manager - static or dynamic; ; idle processes - the number of idle processes; ; active processes - the number of active processes; ; total processes - the number of idle + active processes. ; max children reached - number of times, the process limit has been reached, ; when pm tries to start more children (works only for ; pm 'dynamic') ; The values of 'idle processes', 'active processes' and 'total processes' are ; updated each second. The value of 'accepted conn' is updated in real time. ; Example output: ; accepted conn: 12073 ; pool: www ; process manager: static ; idle processes: 35 ; active processes: 65 ; total processes: 100 ; max children reached: 1 ; By default the status page output is formatted as text/plain. Passing either ; 'html' or 'json' as a query string will return the corresponding output ; syntax. Example: ; http://www.foo.bar/status ; http://www.foo.bar/status?json ; http://www.foo.bar/status?html ; Note: The value must start with a leading slash (/). The value can be ; anything, but it may not be a good idea to use the .php extension or it ; may conflict with a real PHP file. ; Default Value: not set pm.status_path = /status ; The ping URI to call the monitoring page of FPM. If this value is not set, no ; URI will be recognized as a ping page. This could be used to test from outside ; that FPM is alive and responding, or to ; - create a graph of FPM availability (rrd or such); ; - remove a server from a group if it is not responding (load balancing); ; - trigger alerts for the operating team (24/7). ; Note: The value must start with a leading slash (/). The value can be ; anything, but it may not be a good idea to use the .php extension or it ; may conflict with a real PHP file. ; Default Value: not set ping.path = /ping ; This directive may be used to customize the response of a ping request. The ; response is formatted as text/plain with a 200 response code. ; Default Value: pong ping.response = pong ; The timeout for serving a single request after which the worker process will ; be killed. This option should be used when the 'max_execution_time' ini option ; does not stop script execution for some reason. A value of '0' means 'off'. ; Available units: s(econds)(default), m(inutes), h(ours), or d(ays) ; Default Value: 0 ;request_terminate_timeout = 0 ; The timeout for serving a single request after which a PHP backtrace will be ; dumped to the 'slowlog' file. A value of '0s' means 'off'. ; Available units: s(econds)(default), m(inutes), h(ours), or d(ays) ; Default Value: 0 ;request_slowlog_timeout = 0 ; The log file for slow requests ; Default Value: not set ; Note: slowlog is mandatory if request_slowlog_timeout is set ;slowlog = log/$pool.log.slow ; Set open file descriptor rlimit. ; Default Value: system defined value ;rlimit_files = 1024 ; Set max core size rlimit. ; Possible Values: 'unlimited' or an integer greater or equal to 0 ; Default Value: system defined value ;rlimit_core = 0 ; Chroot to this directory at the start. This value must be defined as an ; absolute path. When this value is not set, chroot is not used. ; Note: you can prefix with '$prefix' to chroot to the pool prefix or one ; of its subdirectories. If the pool prefix is not set, the global prefix ; will be used instead. ; Note: chrooting is a great security feature and should be used whenever ; possible. However, all PHP paths will be relative to the chroot ; (error_log, sessions.save_path, ...). ; Default Value: not set ;chroot = ; Chdir to this directory at the start. ; Note: relative path can be used. ; Default Value: current directory or / when chroot ;chdir = /var/www ; Redirect worker stdout and stderr into main error log. If not set, stdout and ; stderr will be redirected to /dev/null according to FastCGI specs. ; Note: on highloaded environement, this can cause some delay in the page ; process time (several ms). ; Default Value: no ;catch_workers_output = yes ; Pass environment variables like LD_LIBRARY_PATH. All $VARIABLEs are taken from ; the current environment. ; Default Value: clean env ;env[HOSTNAME] = $HOSTNAME ;env[PATH] = /usr/local/bin:/usr/bin:/bin ;env[TMP] = /tmp ;env[TMPDIR] = /tmp ;env[TEMP] = /tmp ; Additional php.ini defines, specific to this pool of workers. These settings ; overwrite the values previously defined in the php.ini. The directives are the ; same as the PHP SAPI: ; php_value/php_flag - you can set classic ini defines which can ; be overwritten from PHP call 'ini_set'. ; php_admin_value/php_admin_flag - these directives won't be overwritten by ; PHP call 'ini_set' ; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no. ; Defining 'extension' will load the corresponding shared extension from ; extension_dir. Defining 'disable_functions' or 'disable_classes' will not ; overwrite previously defined php.ini values, but will append the new value ; instead. ; Note: path INI options can be relative and will be expanded with the prefix ; (pool, global or /opt/php5) ; Default Value: nothing is defined by default except the values in php.ini and ; specified at startup with the -d argument ;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected] ;php_flag[display_errors] = off ;php_admin_value[error_log] = /var/log/fpm-php.www.log ;php_admin_flag[log_errors] = on ;php_admin_value[memory_limit] = 32M php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i

    Read the article

< Previous Page | 94 95 96 97 98