Search Results

Search found 2834 results on 114 pages for 'filesystem corruption'.

Page 106/114 | < Previous Page | 102 103 104 105 106 107 108 109 110 111 112 113 | Next Page >

Master-slave vs. peer-to-peer archictecture: benefits and problems

- by Ashok_Ora

Normal 0 false false false EN-US X-NONE X-NONE Almost two decades ago, I was a member of a database development team that introduced adaptive locking. Locking, the most popular concurrency control technique in database systems, is pessimistic. Locking ensures that two or more conflicting operations on the same data item don’t “trample” on each other’s toes, resulting in data corruption. In a nutshell, here’s the issue we were trying to address. In everyday life, traffic lights serve the same purpose. They ensure that traffic flows smoothly and when everyone follows the rules, there are no accidents at intersections. As I mentioned earlier, the problem with typical locking protocols is that they are pessimistic. Regardless of whether there is another conflicting operation in the system or not, you have to hold a lock! Acquiring and releasing locks can be quite expensive, depending on how many objects the transaction touches. Every transaction has to pay this penalty. To use the earlier traffic light analogy, if you have ever waited at a red light in the middle of nowhere with no one on the road, wondering why you need to wait when there’s clearly no danger of a collision, you know what I mean. The adaptive locking scheme that we invented was able to minimize the number of locks that a transaction held, by detecting whether there were one or more transactions that needed conflicting eyou could get by without holding any lock at all. In many “well-behaved” workloads, there are few conflicts, so this optimization is a huge win. If, on the other hand, there are many concurrent, conflicting requests, the algorithm gracefully degrades to the “normal” behavior with minimal cost. We were able to reduce the number of lock requests per TPC-B transaction from 178 requests down to 2! Wow! This is a dramatic improvement in concurrency as well as transaction latency. The lesson from this exercise was that if you can identify the common scenario and optimize for that case so that only the uncommon scenarios are more expensive, you can make dramatic improvements in performance without sacrificing correctness. So how does this relate to the architecture and design of some of the modern NoSQL systems? NoSQL systems can be broadly classified as master-slave sharded, or peer-to-peer sharded systems. NoSQL systems with a peer-to-peer architecture have an interesting way of handling changes. Whenever an item is changed, the client (or an intermediary) propagates the changes synchronously or asynchronously to multiple copies (for availability) of the data. Since the change can be propagated asynchronously, during some interval in time, it will be the case that some copies have received the update, and others haven’t. What happens if someone tries to read the item during this interval? The client in a peer-to-peer system will fetch the same item from multiple copies and compare them to each other. If they’re all the same, then every copy that was queried has the same (and up-to-date) value of the data item, so all’s good. If not, then the system provides a mechanism to reconcile the discrepancy and to update stale copies. So what’s the problem with this? There are two major issues: First, IT’S HORRIBLY PESSIMISTIC because, in the common case, it is unlikely that the same data item will be updated and read from different locations at around the same time! For every read operation, you have to read from multiple copies. That’s a pretty expensive, especially if the data are stored in multiple geographically separate locations and network latencies are high. Second, if the copies are not all the same, the application has to reconcile the differences and propagate the correct value to the out-dated copies. This means that the application program has to handle discrepancies in the different versions of the data item and resolve the issue (which can further add to cost and operation latency). Resolving discrepancies is only one part of the problem. What if the same data item was updated independently on two different nodes (copies)? In that case, due to the asynchronous nature of change propagation, you might land up with different versions of the data item in different copies. In this case, the application program also has to resolve conflicts and then propagate the correct value to the copies that are out-dated or have incorrect versions. This can get really complicated. My hunch is that there are many peer-to-peer-based applications that don’t handle this correctly, and worse, don’t even know it. Imagine have 100s of millions of records in your database – how can you tell whether a particular data item is incorrect or out of date? And what price are you willing to pay for ensuring that the data can be trusted? Multiple network messages per read request? Discrepancy and conflict resolution logic in the application, and potentially, additional messages? All this overhead, when all you were trying to do was to read a data item. Wouldn’t it be simpler to avoid this problem in the first place? Master-slave architectures like the Oracle NoSQL Database handles this very elegantly. A change to a data item is always sent to the master copy. Consequently, the master copy always has the most current and authoritative version of the data item. The master is also responsible for propagating the change to the other copies (for availability and read scalability). Client drivers are aware of master copies and replicas, and client drivers are also aware of the “currency” of a replica. In other words, each NoSQL Database client knows how stale a replica is. This vastly simplifies the job of the application developer. If the application needs the most current version of the data item, the client driver will automatically route the request to the master copy. If the application is willing to tolerate some staleness of data (e.g. a version that is no more than 1 second out of date), the client can easily determine which replica (or set of replicas) can satisfy the request, and route the request to the most efficient copy. This results in a dramatic simplification in application logic and also minimizes network requests (the driver will only send the request to exactl the right replica, not many). So, back to my original point. A well designed and well architected system minimizes or eliminates unnecessary overhead and avoids pessimistic algorithms wherever possible in order to deliver a highly efficient and high performance system. If you’ve every programmed an Oracle NoSQL Database application, you’ll know the difference! /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

Read the article
?Oracle????SELECT????UNDO

- by Liu Maclean(???)

????????Oracle?????(dirty read),?Oracle??????Asktom????????Oracle???????, ???undo??????????(before image)??????Consistent, ???????????????Oracle????????????? ????????? ??,??,Oracle?????????????RDBMS,???????????? ?????????2?????: _offline_rollback_segments or _corrupted_rollback_segments ?2?????????Oracle???????????ORA-600[4XXX]???????????????,???2??????Undo??Corruption????????????,?????2????????????????? ??????????????_offline_rollback_segments ? _corrupted_rollback_segments ?2?????: ???????(FORCE OPEN DATABASE) ????????????(consistent read & delayed block cleanout) ??????rollback segment??? ?????:???????Oracle????????,??????????2?????,?????????????!! _offline_rollback_segments ? _corrupted_rollback_segments ???????????: ??2???????Undo Segments(???/???)????????online ?UNDO$???????????OFFLINE??? ???instance??????????????????? ??????Undo Segments????????active transaction????????????dead??SMON???(????????SMON??(?):Recover Dead transaction) _OFFLINE_ROLLBACK_SEGMENTS(offline undo segment list)????(hidden parameter)?????: ???startup???open database???????_OFFLINE_ROLLBACK_SEGMENTS????Undo segments(???/???),?????undo segments????????alert.log???TRACE?????,???????startup?? ?????????????,?ITL?????undo segments?: ???undo segments?transaction table?????????????????? ???????????commit,?????CR??? ????undo segments????(???corrupted??,???missed??)???????????alert.log,??????? ?DML?????????????????????????????????CPU,????????????????????? _CORRUPTED_ROLLBACK_SEGMENTS(corrupted undo segment list)??????????: ?????startup?open database???_CORRUPTED_ROLLBACK_SEGMENTS????undo segments(???/???)???????? ???????_CORRUPTED_ROLLBACK_SEGMENTS???undo segments????????????commit,???undo segments???drop??? ??????????? ??????????????????,?????????????????? ??bootstrap???????????,?????????ORA-00704: bootstrap process failure??,???????????(???Oracle????:??ORA-00600:[4000] ORA-00704: bootstrap process failure????) ??????_CORRUPTED_ROLLBACK_SEGMENTS????????????????????,??????????????? Oracle???????TXChecker??????????? ???????2?????,??????????????_CORRUPTED_ROLLBACK_SEGMENTS?????SELECT????UNDO???????: SQL> alter system set event= '10513 trace name context forever, level 2' scope=spfile; System altered. SQL> alter system set "_in_memory_undo"=false scope=spfile; System altered. 10513 level 2 event????SMON ??rollback ??? dead transaction _in_memory_undo ?? in memory undo ?? SQL> startup force; ORACLE instance started. Total System Global Area 3140026368 bytes Fixed Size 2232472 bytes Variable Size 1795166056 bytes Database Buffers 1325400064 bytes Redo Buffers 17227776 bytes Database mounted. Database opened. session A: SQL> conn maclean/maclean Connected. SQL> create table maclean tablespace users as select 1 t1 from dual connect by level exec dbms_stats.gather_table_stats('','MACLEAN'); PL/SQL procedure successfully completed. SQL> set autotrace on; SQL> select sum(t1) from maclean; SUM(T1) ---------- 501 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 515 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processe ???????????,????current block, ????????,consistent gets??3? SQL> update maclean set t1=0; 501 rows updated. SQL> alter system checkpoint; System altered. ??session A?commit; ???? session: SQL> conn maclean/maclean Connected. SQL> SQL> set autotrace on; SQL> select sum(t1) from maclean; SUM(T1) ---------- 501 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 505 consistent gets 0 physical reads 108 redo size 515 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ?????? ?????????undo??CR?,???consistent gets??? 505 [oracle@vrh8 ~]$ ps -ef|grep LOCAL=YES |grep -v grep oracle 5841 5839 0 09:17 ? 00:00:00 oracleG10R25 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq))) [oracle@vrh8 ~]$ kill -9 5841 ??session A???Server Process????,???dead transaction ????smon?? select ktuxeusn, to_char(sysdate, 'DD-MON-YYYY HH24:MI:SS') "Time", ktuxesiz, ktuxesta from x$ktuxe where ktuxecfl = 'DEAD'; KTUXEUSN Time KTUXESIZ KTUXESTA ---------- -------------------- ---------- ---------------- 2 06-AUG-2012 09:20:45 7 ACTIVE ???1?active rollback segment SQL> conn maclean/maclean Connected. SQL> set autotrace on; SQL> select sum(t1) from maclean; SUM(T1) ---------- 501 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 411 consistent gets 0 physical reads 108 redo size 515 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ????? ????kill?? ???smon ??dead transaction , ???????????? ?????undo??????? ????active?rollback segment??? SQL> select segment_name from dba_rollback_segs where segment_id=2; SEGMENT_NAME ------------------------------ _SYSSMU2$ SQL> alter system set "_corrupted_rollback_segments"='_SYSSMU2$' scope=spfile; System altered. ? _corrupted_rollback_segments ?? ???2?rollback segment, ????????undo SQL> startup force; ORACLE instance started. Total System Global Area 3140026368 bytes Fixed Size 2232472 bytes Variable Size 1795166056 bytes Database Buffers 1325400064 bytes Redo Buffers 17227776 bytes Database mounted. Database opened. SQL> conn maclean/maclean Connected. SQL> set autotrace on; SQL> select sum(t1) from maclean; SUM(T1) ---------- 94 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 228 recursive calls 0 db block gets 29 consistent gets 5 physical reads 116 redo size 514 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 4 sorts (memory) 0 sorts (disk) 1 rows processed SQL> / SUM(T1) ---------- 94 Execution Plan ---------------------------------------------------------- Plan hash value: 1679547536 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 3 | | | | 2 | TABLE ACCESS FULL| MACLEAN | 501 | 1503 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------ Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 514 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ?????? consistent gets???3,?????????????????,??ITL???UNDO SEGMENTS?_corrupted_rollback_segments????,???????????COMMIT??,????UNDO? ???????,?????????????????????????(????????????????????),????????????????? ???? , ?????

Read the article
3Ware 9650SE RAID-6, two degraded drives, one ECC, rebuild stuck

- by cswingle

This morning I came in the office to discover that two of the drives on a RAID-6, 3ware 9650SE controller were marked as degraded and it was rebuilding the array. After getting to about 4%, it got ECC errors on a third drive (this may have happened when I attempted to access the filesystem on this RAID and got I/O errors from the controller). Now I'm in this state: > /c2/u1 show Unit UnitType Status %RCmpl %V/I/M Port Stripe Size(GB) ------------------------------------------------------------------------ u1 RAID-6 REBUILDING 4%(A) - - 64K 7450.5 u1-0 DISK OK - - p5 - 931.312 u1-1 DISK OK - - p2 - 931.312 u1-2 DISK OK - - p1 - 931.312 u1-3 DISK OK - - p4 - 931.312 u1-4 DISK OK - - p11 - 931.312 u1-5 DISK DEGRADED - - p6 - 931.312 u1-6 DISK OK - - p7 - 931.312 u1-7 DISK DEGRADED - - p3 - 931.312 u1-8 DISK WARNING - - p9 - 931.312 u1-9 DISK OK - - p10 - 931.312 u1/v0 Volume - - - - - 7450.5 Examining the SMART data on the three drives in question, the two that are DEGRADED are in good shape (PASSED without any Current_Pending_Sector or Offline_Uncorrectable errors), but the drive listed as WARNING has 24 uncorrectable sectors. And, the "rebuild" has been stuck at 4% for ten hours now. So: How do I get it to start actually rebuilding? This particular controller doesn't appear to support /c2/u1 resume rebuild, and the only rebuild command that appears to be an option is one that wants to know what disk to add (/c2/u1 start rebuild disk=<p:-p...> [ignoreECC] according to the help). I have two hot spares in the server, and I'm happy to engage them, but I don't understand what it would do with that information in the current state it's in. Can I pull out the drive that is demonstrably failing (the WARNING drive), when I have two DEGRADED drives in a RAID-6? It seems to me that the best scenario would be for me to pull the WARNING drive and tell it to use one of my hot spares in the rebuild. But won't I kill the thing by pulling a "good" drive in a RAID-6 with two DEGRADED drives? Finally, I've seen reference in other posts to a bad bug in this controller that causes good drives to be marked as bad and that upgrading the firmware may help. Is flashing the firmware a risky operation given the situation? Is it likely to help or hurt wrt the rebuilding-but-stuck-at-4% RAID? Am I experiencing this bug in action? Advice outside the spiritual would be much appreciated. Thanks.

Read the article
External USB drive is failing

- by dma_k

I have an external USB 2.0 drive WD My Book Mirror Edition, running in RAID 1 (mirroring) mode. A while ago the hard drive started to fail: it stops responding (directories are not listed returning an error after a big timeout). Sometimes it works for weeks before a failure, sometimes – few hours. Small write operations (like removing few files or editing a small file) do not harm, but when copying large files to the drive over the network, or creating the archive locally, the kernel dumps. Also interesting to note that once kernel has failed, Linux does not want to reboot normally (reboot hangs); when Linux box is shutdown with power button, WD drive does not go to sleep mode (as it usually does): leds continue to run, pressing and holding the "shutdown" button on drive's back panel does not do anything; only unplugging the power cord helps. Here goes the boot log: Aug 16 00:32:21 kernel: [ 1.514106] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver Aug 16 00:32:21 kernel: [ 1.657738] ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23 Aug 16 00:32:21 kernel: [ 1.673747] ehci_hcd 0000:00:1d.7: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 1.673751] ehci_hcd 0000:00:1d.7: EHCI Host Controller Aug 16 00:32:21 kernel: [ 1.725224] ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1 Aug 16 00:32:21 kernel: [ 1.741647] ehci_hcd 0000:00:1d.7: using broken periodic workaround Aug 16 00:32:21 kernel: [ 1.761790] ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported Aug 16 00:32:21 kernel: [ 1.761873] ehci_hcd 0000:00:1d.7: irq 23, io mem 0xfdfff000 Aug 16 00:32:21 kernel: [ 1.796043] ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00 Aug 16 00:32:21 kernel: [ 1.879069] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002 Aug 16 00:32:21 kernel: [ 1.895446] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 1.911796] usb usb1: Product: EHCI Host Controller Aug 16 00:32:21 kernel: [ 1.928015] usb usb1: Manufacturer: Linux 2.6.32-5-686 ehci_hcd Aug 16 00:32:21 kernel: [ 1.944331] usb usb1: SerialNumber: 0000:00:1d.7 Aug 16 00:32:21 kernel: [ 1.961285] usb usb1: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 1.994412] hub 1-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 2.010864] hub 1-0:1.0: 8 ports detected Aug 16 00:32:21 kernel: [ 2.085939] uhci_hcd: USB Universal Host Controller Interface driver Aug 16 00:32:21 kernel: [ 2.191945] uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23 Aug 16 00:32:21 kernel: [ 2.226029] uhci_hcd 0000:00:1d.0: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 2.226034] uhci_hcd 0000:00:1d.0: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.243237] uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2 Aug 16 00:32:21 kernel: [ 2.260390] uhci_hcd 0000:00:1d.0: irq 23, io base 0x0000fe00 Aug 16 00:32:21 kernel: [ 2.277517] usb usb2: New USB device found, idVendor=1d6b, idProduct=0001 Aug 16 00:32:21 kernel: [ 2.294815] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 2.312173] usb usb2: Product: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.329534] usb usb2: Manufacturer: Linux 2.6.32-5-686 uhci_hcd Aug 16 00:32:21 kernel: [ 2.346828] usb usb2: SerialNumber: 0000:00:1d.0 Aug 16 00:32:21 kernel: [ 2.412989] usb usb2: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 2.430651] usb 1-2: new high speed USB device using ehci_hcd and address 2 Aug 16 00:32:21 kernel: [ 2.449046] hub 2-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 2.466514] hub 2-0:1.0: 2 ports detected Aug 16 00:32:21 kernel: [ 2.484639] uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19 Aug 16 00:32:21 kernel: [ 2.537750] uhci_hcd 0000:00:1d.1: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 2.537756] uhci_hcd 0000:00:1d.1: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.555085] uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3 Aug 16 00:32:21 kernel: [ 2.572231] uhci_hcd 0000:00:1d.1: irq 19, io base 0x0000fd00 Aug 16 00:32:21 kernel: [ 2.589593] usb usb3: New USB device found, idVendor=1d6b, idProduct=0001 Aug 16 00:32:21 kernel: [ 2.606869] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 2.624134] usb usb3: Product: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.641329] usb usb3: Manufacturer: Linux 2.6.32-5-686 uhci_hcd Aug 16 00:32:21 kernel: [ 2.658505] usb usb3: SerialNumber: 0000:00:1d.1 Aug 16 00:32:21 kernel: [ 2.675843] usb usb3: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 2.692864] hub 3-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 2.709651] hub 3-0:1.0: 2 ports detected Aug 16 00:32:21 kernel: [ 2.727378] uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18 Aug 16 00:32:21 kernel: [ 2.768252] uhci_hcd 0000:00:1d.2: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 2.768258] uhci_hcd 0000:00:1d.2: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.806679] uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4 Aug 16 00:32:21 kernel: [ 2.824117] uhci_hcd 0000:00:1d.2: irq 18, io base 0x0000fc00 Aug 16 00:32:21 kernel: [ 2.841405] usb 1-2: New USB device found, idVendor=1058, idProduct=1104 Aug 16 00:32:21 kernel: [ 2.858448] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 Aug 16 00:32:21 kernel: [ 2.875347] usb 1-2: Product: My Book Aug 16 00:32:21 kernel: [ 2.892113] usb 1-2: Manufacturer: Western Digital Aug 16 00:32:21 kernel: [ 2.908915] usb 1-2: SerialNumber: 575532553130303530353538 Aug 16 00:32:21 kernel: [ 2.943242] usb usb4: New USB device found, idVendor=1d6b, idProduct=0001 Aug 16 00:32:21 kernel: [ 2.960405] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 2.977615] usb usb4: Product: UHCI Host Controller Aug 16 00:32:21 kernel: [ 2.994687] usb usb4: Manufacturer: Linux 2.6.32-5-686 uhci_hcd Aug 16 00:32:21 kernel: [ 3.011711] usb usb4: SerialNumber: 0000:00:1d.2 Aug 16 00:32:21 kernel: [ 3.029589] usb usb4: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 3.082027] sd 2:0:0:0: [sda] Attached SCSI disk Aug 16 00:32:21 kernel: [ 3.103953] usb 1-2: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 3.122625] hub 4-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 3.140484] hub 4-0:1.0: 2 ports detected Aug 16 00:32:21 kernel: [ 3.161680] uhci_hcd 0000:00:1d.3: PCI INT D -> GSI 16 (level, low) -> IRQ 16 Aug 16 00:32:21 kernel: [ 3.181257] uhci_hcd 0000:00:1d.3: setting latency timer to 64 Aug 16 00:32:21 kernel: [ 3.181263] uhci_hcd 0000:00:1d.3: UHCI Host Controller Aug 16 00:32:21 kernel: [ 3.198614] uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5 Aug 16 00:32:21 kernel: [ 3.216012] uhci_hcd 0000:00:1d.3: irq 16, io base 0x0000fb00 Aug 16 00:32:21 kernel: [ 3.249877] Uniform CD-ROM driver Revision: 3.20 Aug 16 00:32:21 kernel: [ 3.267765] usb usb5: New USB device found, idVendor=1d6b, idProduct=0001 Aug 16 00:32:21 kernel: [ 3.284947] usb usb5: New USB device strings: Mfr=3, Product=2, SerialNumber=1 Aug 16 00:32:21 kernel: [ 3.302023] usb usb5: Product: UHCI Host Controller Aug 16 00:32:21 kernel: [ 3.319215] usb usb5: Manufacturer: Linux 2.6.32-5-686 uhci_hcd Aug 16 00:32:21 kernel: [ 3.336298] usb usb5: SerialNumber: 0000:00:1d.3 Aug 16 00:32:21 kernel: [ 3.368377] Initializing USB Mass Storage driver... Aug 16 00:32:21 kernel: [ 3.390652] usbcore: registered new interface driver hiddev Aug 16 00:32:21 kernel: [ 3.408109] scsi4 : SCSI emulation for USB Mass Storage devices Aug 16 00:32:21 kernel: [ 3.425281] sr 0:0:1:0: Attached scsi CD-ROM sr0 Aug 16 00:32:21 kernel: [ 3.438978] sr 0:0:1:0: Attached scsi generic sg0 type 5 Aug 16 00:32:21 kernel: [ 3.456328] usbcore: registered new interface driver usb-storage Aug 16 00:32:21 kernel: [ 3.474564] usb-storage: device found at 2 Aug 16 00:32:21 kernel: [ 3.474567] usb-storage: waiting for device to settle before scanning Aug 16 00:32:21 kernel: [ 3.475320] sd 2:0:0:0: Attached scsi generic sg1 type 0 Aug 16 00:32:21 kernel: [ 3.492587] USB Mass Storage support registered. Aug 16 00:32:21 kernel: [ 3.510930] usb usb5: configuration #1 chosen from 1 choice Aug 16 00:32:21 kernel: [ 3.531076] hub 5-0:1.0: USB hub found Aug 16 00:32:21 kernel: [ 3.548399] hub 5-0:1.0: 2 ports detected Aug 16 00:32:21 kernel: [ 3.591743] input: Western Digital My Book as /devices/pci0000:00/0000:00:1d.7/usb1/1-2/1-2:1.1/input/input2 Aug 16 00:32:21 kernel: [ 3.609515] generic-usb 0003:1058:1104.0001: input,hidraw0: USB HID v1.11 Device [Western Digital My Book] on usb-0000:00:1d.7-2/input1 Aug 16 00:32:21 kernel: [ 3.627466] usbcore: registered new interface driver usbhid Aug 16 00:32:21 kernel: [ 8.581664] usb-storage: device scan complete Aug 16 00:32:21 kernel: [ 8.624270] scsi 4:0:0:0: Direct-Access WD My Book 1008 PQ: 0 ANSI: 4 Aug 16 00:32:21 kernel: [ 8.655135] scsi 4:0:0:1: Enclosure WD My Book Device 1008 PQ: 0 ANSI: 4 Aug 16 00:32:21 kernel: [ 8.675393] sd 4:0:0:0: Attached scsi generic sg2 type 0 Aug 16 00:32:21 kernel: [ 8.698669] scsi 4:0:0:1: Attached scsi generic sg3 type 13 Aug 16 00:32:21 kernel: [ 8.723370] sd 4:0:0:0: [sdb] 1953513472 512-byte logical blocks: (1.00 TB/931 GiB) Aug 16 00:32:21 kernel: [ 8.750477] sd 4:0:0:0: [sdb] Write Protect is off Aug 16 00:32:21 kernel: [ 8.769411] sd 4:0:0:0: [sdb] Mode Sense: 10 00 00 00 Aug 16 00:32:21 kernel: [ 8.769414] sd 4:0:0:0: [sdb] Assuming drive cache: write through Aug 16 00:32:21 kernel: [ 8.822971] sd 4:0:0:0: [sdb] Assuming drive cache: write through Aug 16 00:32:21 kernel: [ 8.841978] sdb: sdb1 Aug 16 00:32:21 kernel: [ 8.905580] sd 4:0:0:0: [sdb] Assuming drive cache: write through Aug 16 00:32:21 kernel: [ 8.924173] sd 4:0:0:0: [sdb] Attached SCSI disk Aug 16 00:32:21 kernel: [ 11.600492] XFS mounting filesystem sdb1 Aug 16 00:32:21 kernel: [ 12.222948] Ending clean XFS mount for filesystem: sdb1 After a while the following appears in a log: Aug 16 09:30:56 kernel: [32359.112029] usb 1-2: reset high speed USB device using ehci_hcd and address 2 Aug 16 09:31:59 kernel: [32422.112035] usb 1-2: reset high speed USB device using ehci_hcd and address 2 Aug 16 09:33:00 kernel: [32483.112029] usb 1-2: reset high speed USB device using ehci_hcd and address 2 And then it is followed by few kernel dumps, which I think, are not good: Aug 16 09:33:40 kernel: [32520.428027] INFO: task xfssyncd:1002 blocked for more than 120 seconds. Aug 16 09:33:40 kernel: [32520.462689] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 16 09:33:40 kernel: [32520.497422] xfssyncd D c3d84a60 0 1002 2 0x00000000 Aug 16 09:33:40 kernel: [32520.532117] f6c9aa80 00000046 c1132742 c3d84a60 00000286 c1418100 c1418100 00000000 Aug 16 09:33:40 kernel: [32520.566867] f6c9ac3c c2808100 00000000 f653b18b 00001d76 00000001 f6c9aa80 c3c3f0e0 Aug 16 09:33:40 kernel: [32520.601343] 08e59242 f6c9ac3c 2e41392b 00000000 08e59242 00000000 c3f7fb48 0067385a Aug 16 09:33:40 kernel: [32520.635533] Call Trace: Aug 16 09:33:40 kernel: [32520.668991] [<c1132742>] ? cfq_set_request+0x0/0x290 Aug 16 09:33:40 kernel: [32520.702804] [<c126b532>] ? io_schedule+0x5f/0x98 Aug 16 09:33:40 kernel: [32520.736555] [<c1128be0>] ? get_request_wait+0xcb/0x146 Aug 16 09:33:40 kernel: [32520.770360] [<c10437ba>] ? autoremove_wake_function+0x0/0x2d Aug 16 09:33:40 kernel: [32520.804110] [<c112907c>] ? __make_request+0x2cc/0x3d9 Aug 16 09:33:40 kernel: [32520.837713] [<c1128230>] ? blk_peek_request+0x135/0x143 Aug 16 09:33:40 kernel: [32520.871265] [<f8582987>] ? scsi_dispatch_cmd+0x185/0x1e5 [scsi_mod] Aug 16 09:33:40 kernel: [32520.904407] [<c1127cf1>] ? generic_make_request+0x266/0x2b4 Aug 16 09:33:40 kernel: [32520.937007] [<c10cf821>] ? bvec_alloc_bs+0x95/0xaf Aug 16 09:33:40 kernel: [32520.969033] [<c1127dfb>] ? submit_bio+0xbc/0xd6 Aug 16 09:33:40 kernel: [32521.000485] [<c10cffd1>] ? bio_add_page+0x28/0x2e Aug 16 09:33:40 kernel: [32521.031403] [<f8918d38>] ? _xfs_buf_ioapply+0x206/0x22b [xfs] Aug 16 09:33:40 kernel: [32521.061888] [<f89197bd>] ? xfs_buf_iorequest+0x38/0x60 [xfs] Aug 16 09:33:40 kernel: [32521.091845] [<f8907230>] ? xlog_bdstrat_cb+0x16/0x3d [xfs] Aug 16 09:33:40 kernel: [32521.121222] [<f8905781>] ? XFS_bwrite+0x32/0x64 [xfs] Aug 16 09:33:40 kernel: [32521.150007] [<f89059be>] ? xlog_sync+0x20b/0x311 [xfs] Aug 16 09:33:40 kernel: [32521.178214] [<f89112fc>] ? xfs_trans_ail_tail+0x12/0x27 [xfs] Aug 16 09:33:40 kernel: [32521.205914] [<f8906261>] ? xlog_state_sync_all+0xa2/0x141 [xfs] Aug 16 09:33:40 kernel: [32521.233074] [<f8906611>] ? _xfs_log_force+0x51/0x68 [xfs] Aug 16 09:33:40 kernel: [32521.259664] [<c103abaf>] ? process_timeout+0x0/0x5 Aug 16 09:33:40 kernel: [32521.285662] [<f8906636>] ? xfs_log_force+0xe/0x27 [xfs] Aug 16 09:33:40 kernel: [32521.311171] [<f89202df>] ? xfs_sync_worker+0x17/0x5c [xfs] Aug 16 09:33:40 kernel: [32521.336117] [<f891fbb7>] ? xfssyncd+0x134/0x17d [xfs] Aug 16 09:33:40 kernel: [32521.360498] [<f891fa83>] ? xfssyncd+0x0/0x17d [xfs] Aug 16 09:33:40 kernel: [32521.384211] [<c1043588>] ? kthread+0x61/0x66 Aug 16 09:33:40 kernel: [32521.407890] [<c1043527>] ? kthread+0x0/0x66 Aug 16 09:33:40 kernel: [32521.430876] [<c1003d47>] ? kernel_thread_helper+0x7/0x10 Aug 16 09:33:40 kernel: [32521.453394] INFO: task flush-8:16:12945 blocked for more than 120 seconds. Aug 16 09:33:40 kernel: [32521.476116] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 16 09:33:40 kernel: [32521.498579] flush-8:16 D 00000000 0 12945 2 0x00000000 Aug 16 09:33:40 kernel: [32521.520649] f4e4d540 00000046 e412e940 00000000 00000002 c1418100 c1418100 c14136ac Aug 16 09:33:40 kernel: [32521.542426] f4e4d6fc c2808100 00000000 00000000 000008b4 00000001 f4e4d540 c3c3f0e0 Aug 16 09:33:40 kernel: [32521.563745] 02e905a8 f4e4d6fc 007a5399 00000000 02e905a8 00000000 f4e2db48 00670b98 Aug 16 09:33:40 kernel: [32521.585077] Call Trace: Aug 16 09:33:40 kernel: [32521.605790] [<c126b532>] ? io_schedule+0x5f/0x98 Aug 16 09:33:40 kernel: [32521.626184] [<c1128be0>] ? get_request_wait+0xcb/0x146 Aug 16 09:33:40 kernel: [32521.646133] [<c10437ba>] ? autoremove_wake_function+0x0/0x2d Aug 16 09:33:40 kernel: [32521.665659] [<c112907c>] ? __make_request+0x2cc/0x3d9 Aug 16 09:33:40 kernel: [32521.684716] [<f891796e>] ? xfs_convert_page+0x30a/0x331 [xfs] Aug 16 09:33:40 kernel: [32521.703366] [<c1127cf1>] ? generic_make_request+0x266/0x2b4 Aug 16 09:33:40 kernel: [32521.721644] [<c10cf821>] ? bvec_alloc_bs+0x95/0xaf Aug 16 09:33:40 kernel: [32521.739465] [<c1127dfb>] ? submit_bio+0xbc/0xd6 Aug 16 09:33:40 kernel: [32521.756896] [<c10cfa45>] ? bio_alloc_bioset+0x7b/0xba Aug 16 09:33:40 kernel: [32521.774046] [<f8917af0>] ? xfs_submit_ioend_bio+0x3b/0x44 [xfs] Aug 16 09:33:40 kernel: [32521.790694] [<f8917ba3>] ? xfs_submit_ioend+0xaa/0xc4 [xfs] Aug 16 09:33:40 kernel: [32521.806736] [<f891817d>] ? xfs_page_state_convert+0x5c0/0x61c [xfs] Aug 16 09:33:40 kernel: [32521.822859] [<c113705b>] ? __lookup_tag+0x8e/0xee Aug 16 09:33:40 kernel: [32521.838958] [<f891840d>] ? xfs_vm_writepage+0x91/0xc4 [xfs] Aug 16 09:33:40 kernel: [32521.855039] [<c108bbcc>] ? __writepage+0x8/0x22 Aug 16 09:33:40 kernel: [32521.871067] [<c108c17b>] ? write_cache_pages+0x1af/0x29f Aug 16 09:33:40 kernel: [32521.886616] [<c108bbc4>] ? __writepage+0x0/0x22 Aug 16 09:33:40 kernel: [32521.901593] [<c108c285>] ? generic_writepages+0x1a/0x21 Aug 16 09:33:40 kernel: [32521.916455] [<f8918338>] ? xfs_vm_writepages+0x0/0x38 [xfs] Aug 16 09:33:40 kernel: [32521.931484] [<c108c2a5>] ? do_writepages+0x19/0x25 Aug 16 09:33:40 kernel: [32521.946648] [<c10c80d9>] ? writeback_single_inode+0xc7/0x273 Aug 16 09:33:40 kernel: [32521.961675] [<c10c8c44>] ? writeback_inodes_wb+0x3dd/0x49c Aug 16 09:33:40 kernel: [32521.976831] [<c10c8e18>] ? wb_writeback+0x115/0x178 Aug 16 09:33:40 kernel: [32521.991778] [<c10c901f>] ? wb_do_writeback+0x121/0x131 Aug 16 09:33:40 kernel: [32522.006538] [<c103abaf>] ? process_timeout+0x0/0x5 Aug 16 09:33:40 kernel: [32522.021091] [<c10c9050>] ? bdi_writeback_task+0x21/0x89 Aug 16 09:33:40 kernel: [32522.035493] [<c10979e5>] ? bdi_start_fn+0x59/0xa4 Aug 16 09:33:40 kernel: [32522.049765] [<c109798c>] ? bdi_start_fn+0x0/0xa4 Aug 16 09:33:40 kernel: [32522.063792] [<c1043588>] ? kthread+0x61/0x66 Aug 16 09:33:40 kernel: [32522.077612] [<c1043527>] ? kthread+0x0/0x66 Aug 16 09:33:40 kernel: [32522.091260] [<c1003d47>] ? kernel_thread_helper+0x7/0x10 Aug 16 09:33:40 kernel: [32522.104966] INFO: task smartctl:13098 blocked for more than 120 seconds. Aug 16 09:33:40 kernel: [32522.118883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Aug 16 09:33:40 kernel: [32522.133012] smartctl D 00000020 0 13098 13097 0x00000000 Aug 16 09:33:40 kernel: [32522.147221] e50b9540 00000086 c11d28a8 00000020 00000770 c1418100 c1418100 c14136ac Aug 16 09:33:40 kernel: [32522.161720] e50b96fc c2808100 00000000 e53e8800 00000000 00000020 c3cec000 c13886c0 Aug 16 09:33:40 kernel: [32522.176217] f99dab68 e50b96fc 007a4f1e 00000001 c4082f24 c4082ed8 00000001 c3c3f0e0 Aug 16 09:33:40 kernel: [32522.190737] Call Trace: Aug 16 09:33:40 kernel: [32522.205038] [<c11d28a8>] ? __netdev_alloc_skb+0x14/0x2d Aug 16 09:33:40 kernel: [32522.219605] [<c126b799>] ? schedule_timeout+0x20/0xb0 Aug 16 09:33:40 kernel: [32522.234144] [<c112820d>] ? blk_peek_request+0x112/0x143 Aug 16 09:33:40 kernel: [32522.248649] [<f85873b6>] ? scsi_request_fn+0x3c1/0x47a [scsi_mod] Aug 16 09:33:40 kernel: [32522.263233] [<c103aba8>] ? del_timer+0x55/0x5c Aug 16 09:33:40 kernel: [32522.277773] [<c126b6a2>] ? wait_for_common+0xa4/0x100 Aug 16 09:33:40 kernel: [32522.292342] [<c102cd8d>] ? default_wake_function+0x0/0x8 Aug 16 09:33:40 kernel: [32522.306958] [<c112b3d1>] ? blk_execute_rq+0x8b/0xb2 Aug 16 09:33:40 kernel: [32522.321569] [<c112b2ac>] ? blk_end_sync_rq+0x0/0x23 Aug 16 09:33:40 kernel: [32522.336070] [<c112b58b>] ? blk_recount_segments+0x13/0x20 Aug 16 09:33:40 kernel: [32522.350583] [<c1127307>] ? blk_rq_bio_prep+0x44/0x74 Aug 16 09:33:40 kernel: [32522.365059] [<c112b0b2>] ? blk_rq_map_kern+0xc5/0xee Aug 16 09:33:40 kernel: [32522.379439] [<c112e2a5>] ? sg_scsi_ioctl+0x221/0x2aa Aug 16 09:33:40 kernel: [32522.393801] [<c112e672>] ? scsi_cmd_ioctl+0x344/0x39a Aug 16 09:33:40 kernel: [32522.408140] [<c1024c87>] ? update_curr+0x106/0x1b3 Aug 16 09:33:40 kernel: [32522.422566] [<c1024c87>] ? update_curr+0x106/0x1b3 Aug 16 09:33:40 kernel: [32522.436832] [<f87676aa>] ? sd_ioctl+0x90/0xb5 [sd_mod] Aug 16 09:33:40 kernel: [32522.451228] [<c112c35f>] ? __blkdev_driver_ioctl+0x53/0x63 Aug 16 09:33:40 kernel: [32522.465689] [<c112cbbf>] ? blkdev_ioctl+0x850/0x891 Aug 16 09:33:40 kernel: [32522.479982] [<c1020474>] ? __wake_up_common+0x34/0x59 Aug 16 09:33:40 kernel: [32522.494138] [<c10244cd>] ? complete+0x28/0x36 Aug 16 09:33:40 kernel: [32522.507986] [<c1086c64>] ? find_get_page+0x1f/0x81 Aug 16 09:33:40 kernel: [32522.521671] [<c10abed5>] ? add_partial+0xe/0x40 Aug 16 09:33:40 kernel: [32522.535285] [<c1086e68>] ? lock_page+0x8/0x1d Aug 16 09:33:40 kernel: [32522.548797] [<c1087432>] ? filemap_fault+0xb5/0x2e6 Aug 16 09:33:40 kernel: [32522.562141] [<c109941c>] ? __do_fault+0x381/0x3b1 Aug 16 09:33:40 kernel: [32522.575441] [<c10d0c30>] ? block_ioctl+0x27/0x2c Aug 16 09:33:40 kernel: [32522.588708] [<c10d0c09>] ? block_ioctl+0x0/0x2c Aug 16 09:33:40 kernel: [32522.601858] [<c10bcd78>] ? vfs_ioctl+0x1c/0x5f Aug 16 09:33:40 kernel: [32522.614917] [<c10bd30c>] ? do_vfs_ioctl+0x4aa/0x4e5 Aug 16 09:33:40 kernel: [32522.627961] [<c10350db>] ? __do_softirq+0x115/0x151 Aug 16 09:33:40 kernel: [32522.640901] [<c126e270>] ? do_page_fault+0x2f1/0x307 Aug 16 09:33:40 kernel: [32522.653803] [<c10bd388>] ? sys_ioctl+0x41/0x58 Aug 16 09:33:40 kernel: [32522.666674] [<c10030fb>] ? sysenter_do_call+0x12/0x28 Then again few messages reset high speed USB device using ehci_hcd and address 2. I have browsed and read similar error reports here and there and I tried: I have upgraded the kernel from v2.6.26-2 to 2.6.32-5, which has not solved the problem. They say, this might a cable problem. I have tried to replace the USB-to-miniUSB cable (that connects external drive with computer) with another one. No changes. Somebody suggests to try another USB port. I have only 4 external USB ports, tried another one with no success. They say to try uhci_hcd. I have unmounted the device, unloaded ehci_hcd and mounted again. The difference was that now in log I get reset full speed USB device using uhci_hcd and address 2 and similar kernel dumps after a while. They say to echo 128 > /sys/block/sdb/device/max_sectors. I tried it with ehci_hcd with no success (note: I have issued this command after the drive was mounted but before using it actively). I have lauched smartmond and checking periodically the output of smartctl: drive temperature is OK, number of bad sectors and uncorrectable errors is 0. Nothing suspicious is reported by S.M.A.R.T. except maybe the following: Aug 16 12:40:12 kernel: [43715.314566] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO Aug 16 12:40:13 kernel: [43715.705622] program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO Of course, I have not tried all combinations of above. But unfortunately, I am run out of cardinal ideas. If anybody can advice something specific about the problem, you are very welcome.

Read the article
ubuntu bind9 AppArmor read permission denied (chroot jail)

- by Richard Whitman

I am trying to run bind9 with chroot jail. I followed the steps mentioned at : http://www.howtoforge.com/debian_bind9_master_slave_system I am getting the following errors in my syslog: Jul 27 16:53:49 conf002 named[3988]: starting BIND 9.7.3 -u bind -t /var/lib/named Jul 27 16:53:49 conf002 named[3988]: built with '--prefix=/usr' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc/bind' '--localstatedir=/var' '--enable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-openssl=/usr' '--with-gssapi=/usr' '--with-gnu-ld' '--with-dlz-postgres=no' '--with-dlz-mysql=no' '--with-dlz-bdb=yes' '--with-dlz-filesystem=yes' '--with-dlz-ldap=yes' '--with-dlz-stub=yes' '--with-geoip=/usr' '--enable-ipv6' 'CFLAGS=-fno-strict-aliasing -DDIG_SIGCHASE -O2' 'LDFLAGS=-Wl,-Bsymbolic-functions' 'CPPFLAGS=' Jul 27 16:53:49 conf002 named[3988]: adjusted limit on open files from 4096 to 1048576 Jul 27 16:53:49 conf002 named[3988]: found 4 CPUs, using 4 worker threads Jul 27 16:53:49 conf002 named[3988]: using up to 4096 sockets Jul 27 16:53:49 conf002 named[3988]: loading configuration from '/etc/bind/named.conf' Jul 27 16:53:49 conf002 named[3988]: none:0: open: /etc/bind/named.conf: permission denied Jul 27 16:53:49 conf002 named[3988]: loading configuration: permission denied Jul 27 16:53:49 conf002 named[3988]: exiting (due to fatal error) Jul 27 16:53:49 conf002 kernel: [74323.514875] type=1400 audit(1343433229.352:108): apparmor="DENIED" operation="open" parent=3987 profile="/usr/sbin/named" name="/var/lib/named/etc/bind/named.conf" pid=3992 comm="named" requested_mask="r" denied_mask="r" fsuid=103 ouid=103 Looks like the process can not read the file /var/lib/named/etc/bind/named.conf. I have made sure that the owner of this file is user bind, and it has the read/write access to it: root@test:/var/lib/named/etc/bind# ls -atl total 64 drwxr-xr-x 3 bind bind 4096 2012-07-27 16:35 .. drwxrwsrwx 2 bind bind 4096 2012-07-27 15:26 zones drwxr-sr-x 3 bind bind 4096 2012-07-26 21:36 . -rw-r--r-- 1 bind bind 666 2012-07-26 21:33 named.conf.options -rw-r--r-- 1 bind bind 514 2012-07-26 21:18 named.conf.local -rw-r----- 1 bind bind 77 2012-07-25 00:25 rndc.key -rw-r--r-- 1 bind bind 2544 2011-07-14 06:31 bind.keys -rw-r--r-- 1 bind bind 237 2011-07-14 06:31 db.0 -rw-r--r-- 1 bind bind 271 2011-07-14 06:31 db.127 -rw-r--r-- 1 bind bind 237 2011-07-14 06:31 db.255 -rw-r--r-- 1 bind bind 353 2011-07-14 06:31 db.empty -rw-r--r-- 1 bind bind 270 2011-07-14 06:31 db.local -rw-r--r-- 1 bind bind 2994 2011-07-14 06:31 db.root -rw-r--r-- 1 bind bind 463 2011-07-14 06:31 named.conf -rw-r--r-- 1 bind bind 490 2011-07-14 06:31 named.conf.default-zones -rw-r--r-- 1 bind bind 1317 2011-07-14 06:31 zones.rfc1918 What could be wrong here?

Read the article
Unable to mount XP share using fs-cifs from Linux

- by MetalSearGolid

I have a head unit that runs Linux that is connected to my PC via an Ethernet cable. I have a Windows XP share on this PC that the head unit needs to be able to mount, however, when mounting using the following command, it fails. Here is the command that fails, along with the verbose output: # fs-cifs -vvvvvvvvv -l //CUMBRIA-XP:192.168.1.2:/hnet /mnt/net cifs[2158679-1]: starting... cifs[2158679-1]: user is to input both name & passwd. cifs[2158679-1]: server [192.168.1.2] share [hnet] prefix [/mnt/net] user [nu ll] passwd [null] Welcome: 192.168.1.2(:/hnet) -> /mnt/net Username:headunit cifs[2158679-1]: user name: headunit length 8 cifs[2158679-1]: new server Password: cifs[2158679-1]: establishing connection to (192.168.1.2)CUMBRIA-XP cifs[2158679-1]: session request: 192.168.1.2:CUMBRIA-XP -> localhost cifs[2158679-1]: negotiating smb dialect cifs[2158679-1]: skey(idx=2): 00000000, challenge:(8), 6137bfa2 f2d7803b cifs[2158679-1]: negotiation: success with dialect=2 cifs[2158679-1]: logging headunit on 192.168.1.2 cifs[2158679-1]: new packet cifs[2158679-1]: returning: mid 0 status= 0 cifs[2158679-1]: smb_logon successful: dialect 2 enpass 1 cifs[2158679-1]: mounting 192.168.1.2:/hnet cifs[2158679-1]: returning: mid 1 status= 13 cifs[2158679-1]: smb_mount: Bad file descriptor cifs[2158679-1]: try upper case share. cifs[2158679-1]: session request: 192.168.1.2:CUMBRIA-XP -> localhost cifs[2158679-1]: negotiating smb dialect cifs[2158679-1]: skey(idx=2): 00000000, challenge:(8), 2d3e910f e3e148c4 cifs[2158679-1]: negotiation: success with dialect=2 cifs[2158679-1]: logging headunit on 192.168.1.2 cifs[2158679-1]: returning: mid 2 status= 0 cifs[2158679-1]: smb_logon successful: dialect 2 enpass 1 cifs[2158679-1]: mounting 192.168.1.2:/HNET cifs[2158679-1]: returning: mid 3 status= 13 cifs[2158679-1]: smb_mount: Bad file descriptor cifs[2158679-1]: mount failed. cifs[2158679-1]: io_mount: smb_connection failed: Bad file descriptor io_mount: Bad file descriptor cifs[2158679-1]: user is to input both name & passwd. fs-cifs: missing arguments, or all mount attempts failed. run "use fs-cifs" or "fs-cifs -h" for help. Any ideas? It is worthy to note that /mnt does not exist on the filesystem, but I was told by the company who gave us these units that fs-cifs should automatically create the /mnt/net folders if they don't exist.

Read the article
IRP_MJ_WRITE latency up to 15 seconds

- by racitup

We have written an application that performs small (22kB) writes to multiple files at once (one thread performing asynchronous queued writes to multiple locations on behalf of other threads) on the same local volume (RAID1). 99.9% of the writes are low-latency but occasionally (maybe every minute or two) we get one or two huge latency writes (I have seen 10 seconds and above) without any real explanation. Platform: Win2003 Server with NTFS. Monitoring: Sysinternals Process Monitor (see link below) and our own application logging. We have tried multiple things to try and solve this that have been gleaned from a few websites, e.g.: Making the first part of file names unique to aid 8.3 name generation Writing files to multiple directories Changing Intel Disk Write Caching Windows File/Printer Sharing Minimize memory used Balance Maximize data throughput for file sharing Maximize data throughput for network applications System-Advanced-Performance-Advanced NtfsDisableLastAccessUpdate - use fsutil behavior set disablelastaccess 1 disable 8.3 name generation - use "fsutil behavior set disable8dot3 1" + restart Enable a large size file system cache Disable paging of the kernel code IO Page Lock Limit Turn Off (or On) the Indexing Service But nothing seems to make much difference. There's a whole host of things we haven't tried yet but we wondered if anyone had come across the same problem, a reason and a solution (programmatic or not)? We can reproduce the problem using IOMeter and a simple setup: Start IOMeter and remove all but the first worker thread in 'Topology' using the disconnect button. Select the Worker thread and put a cross in the box next to the disk you want to use in the Disk Targets tab and put '2000000' in Maximum Disk Size (NOTE: must have at least 1GB free space; sector size is 512 bytes) Next create a new access specification and add it to the worker thread: Transfer Request Size = 22kB 100% Sequential Percent of Access Spec = 100% Percent Read/Write = 100% Write Change Results Display Update Frequency to 5 seconds, Test Setup Run Time to 20 seconds and both 'Number of Workers to Spawn Automatically' settings to zero. Select the Worker Thread in the Topology panel and hit the Duplicate Worker button 59 times to create 60 threads with identical settings. Hit the 'Go' button (green flag) and monitor the Results tab. The 'Maximum I/O Response Time (ms)' always hits at least 3500 on our machine. Our machine isn't exactly slow (Xeon 8 core rack server with 4GB and onboard RAID). I'd be interested to see what other people get. We have a feeling it might be something to do with the NTFS filesystem (ours is currently 75% full of fragmented files) and we are going to try a few things around this principle. But it is also related to disk performance since we don't see it on a RAMDisk and it's not as severe on a RAID10 array. Any help is much appreciated. Richard Right-click and select 'Open Link in New Tab': ProcMon Result

Read the article
Bidirectional real-time sync of large file tree between two distant linux servers

- by dlo

By large file tree I mean about 200k files, and growing all the time. A relatively small number of files are being changed in any given hour though. By bidirectional I mean that changes may occur on either server and need to be pushed to the other, so rsync doesn't seem appropriate. By distant I mean that the servers are both in data centers, but geographically remote from each other. Currently there are only 2 servers, but that may expand over time. By real-time, it's ok for there to be a little latency between syncing, but running a cron every 1-2 minutes doesn't seem right, since a very small fraction of files may change in any given hour, let alone minute. EDIT: This is running on VPS's so I might be limited on the kinds of kernel-level stuff I can do. Also, the VPS's are not resource-rich, so I'd shy away from solutions that require lots of ram (like Gluster?). What's the best / most "accepted" approach to get this done? This seems like it would be a common need, but I haven't been able to find a generally accepted approach yet, which was surprising. (I'm seeking the safety of the masses. :) I've come across lsyncd to trigger a sync at the filesystem change level. That seems clever though not super common, and I'm a bit confused by the various lsyncd approaches. There's just using lsyncd with rsync, but it seems this could be fragile for bidirectionality since rsync doesn't have a notion of memory (eg- to know whether a deleted file on A should be deleted on B or whether it's a new file on B that should be copied to A). lipsync appears to be just a lsyncd+rsync implementation, right? Then there's using lsyncd with csync2, like this: http://www.axivo.com/community/threads/lightning-fast-synchronization-with-csync2-and-lsyncd.121/ ... I'm leaning towards this approach, but csync2 is a little quirky, though I did do a successful test of it. I'm mostly concerned that I haven't been able to find a lot of community confirmation of this method. People on here seem to like Unison a lot, but it seems that it is no longer under active development and it's not clear that it has an automatic trigger like lsyncd. I've seen Gluster mentioned, but maybe overkill for what I need? UPDATE: fyi- I ended up going with the original solution I mentioned: lsyncd+csync2. It seems to work quite well, and I like the architectural approach of having the servers be very loosely joined, so that each server can operate indefinitely on its own regardless of the link quality between them.

Read the article
How to create a Windows 7 installation usb media from linux ? (to install Windows 7) - Help need to know better method

- by Abel Coto

I have been reading some web pages and posts here and in other forums about how to create a Windows 7 installation Usb media (to install windows 7 using a usb) from linux. I asked in technet about this , and they give me general ideas about how to do it I personally am not very familiar with linux, but basicaly all that you need to do... in whatever way you do it is the following: Format a usb flash drive, either fat32 or ntfs create a partition that is large enough to host the windows installation (give or take 3GB for 64bit, aroudn 2.5gb for 32bit) and mark that partition as active/bootable. Since this can be done with windows, but just as well with a tool like gparted, you should be able to do the same in debian. Once you have created that partition, mount the iso that you download, and copy all files starting from the root, into the root of the usb flash drive. That's all there's to it. There is a method that i found in various places,that is almost the same that the man of technet has said. But,there is a step,that in that method is done,that i don't know if it is really necessary,or not. Not allways dd works.Basically, the missing step was to write a proper boot sector to the usb stick, which can be done from linux with ms-sys. This works with the Win7 retail version. Here is the complete rundown again: Install ms-sys Check what device your usb media is asigned - here we will assume it is /dev/sdb. Delete all partitions, create a new one taking up all the space, set type to NTFS, and set it bootable: *# cfdisk /dev/sdb* Create NTFS filesystem: *# mkfs.ntfs -f /dev/sdb1* Mount iso and usb media: *# mount -o loop win7.iso /mnt/iso # mount /dev/sdb1 /mnt/usb* Copy over all files: *# cp -r /mnt/iso/* /mnt/usb/* Write Windows 7 MBR on usb stick: *# ms-sys -7 /dev/sdb* ...and you're done. Shouldn't the usb work without doing the last step "# ms-sys -7 /dev/sdb" or to make the usb bootable , is a must , not only to mark the partition as bootable ? Would be better use rsync instead of cp -r ? All this steps should be done as root, i suppose , or if not , chmod to 664 and chown the directories where are mounted the usb and the iso, no ? But i suppose that the easier thing is to copy the data as root , and that this will not affect to the data. Has anyone tried this method or some similar like copying the iso with dd ?

Read the article
Autoloading Development or Production configs (best practices)

- by Xeoncross

When programming sites you usually have one set of config files for the development environment and another set for the production server (or one file with both settings). I am assuming all projects should be handled by version control like git or svn. Manual file transfers (like FTP) is wrong on so many levels. How you enable/disable the correct settings (so that your system knows which ones to use) is a problem for me. Each system I work on just kind of jimmy-rigs a solution. Below are the 3 methods I know of and I am hoping that someone can submit a more elegant solutions. 1) File Based The system loads a folder structure based on the URL requested. /site.com /site.fakeTLD /lib index.php For example, if the url is http://site.com then the system loads the production config files located in the site.com folder. However, if I'm working on the site locally I visit http://site.fakeTLD to work on the local copy of the site. To setup this I edit my hosts file and add site.fakeTLD to point to my own computer (127.0.0.1/localhost) and then create a vhost in apache. So now I can work on the codebase locally and then push to the server without any trouble. The problem is that this is susceptible to a "host" injection attack. So someone loading site.com could set the host to site.fakeTLD and then the system would load my development config files instead of production. 2) Config Based The config files contain on section for development - and one for production. The problem is that each time you go to push your changes to the repo you have to edit the file to specify which set of config options should be used. $use = 'production'; //'development'; This leaves the repo open to human error should one of the developers forget to enable the right setting. 3) File System Check Based All the development machines have an extra empty file called "development.txt" or something. Each time the system loads it checks for this file - if found then it knows it is in development mode - if missing then it knows it is in production mode. Since the file is NEVER ADDED to the repo then it will never be pushed (and checked out) on the production machine. However, this just doesn't feel right and causes a slight slow down since all filesystem checks are slow.

Read the article
Solaris10 x86 mirror. Making second disk booteable when failure

- by Kani

Did a mirror (RAID1) with Solaris 10 in x86. Everything OK. Now, I´m trying to make the second disk booteable, this is: from grub or in case of failure of disk1. I edited /boot/grub/menu.lst: #---------- ADDED BY BOOTADM - DO NOT EDIT ---------- title Solaris 10 9/10 s10x_u9wos_14a X86 findroot (rootfs1,0,a) kernel /platform/i86pc/multiboot module /platform/i86pc/boot_archive #---------------------END BOOTADM-------------------- #---------- ADDED BY BOOTADM - DO NOT EDIT ---------- title Solaris failsafe findroot (rootfs1,0,a) kernel /boot/multiboot -s module /boot/amd64/x86.miniroot-safe #---------------------END BOOTADM-------------------- #---------- ADDED BY BOOTADM - DO NOT EDIT ---------- title Solaris failsafe findroot (rootfs1,0,a) kernel /boot/multiboot kernel/unix -s module /boot/x86.miniroot-safe #---------------------END BOOTADM-------------------- #Make second disk booteable!!!!!!! title alternate boot findroot (rootfs1,1,a) kernel /boot/multiboot kernel/unix -s module /boot/x86.miniroot-safe But is not working. In the BIOS, when I select "alternate boot" I get: Error 15: 15 file not found also, how to configure to GRUB to make the disk2 to boot in case of error in disk1? Additionally, I did (but not related to GRUB): eeprom altbootpath=/devices/pci@0,0/pci108e,5352@1f,2/disk@1,0:a Here is the output of some commands that may help you: /sbin/biosdev 0x80 /pci@0,0/pci108e,5352@1f,2/disk@0,0 0x81 /pci@0,0/pci108e,5352@1f,2/disk@1,0 ls -l /dev/dsk/c1t?d0s0 lrwxrwxrwx 1 root root 50 Jul 7 12:01 /dev/dsk/c1t0d0s0 -> ../../devices/pci@0,0/pci108e,5352@1f,2/disk@0,0:a lrwxrwxrwx 1 root root 50 Jul 7 12:01 /dev/dsk/c1t1d0s0 -> ../../devices/pci@0,0/pci108e,5352@1f,2/disk@1,0:a more /boot/solaris/bootenv.rc setprop ata-dma-enabled '1' setprop atapi-cd-dma-enabled '0' setprop ttyb-rts-dtr-off 'false' setprop ttyb-ignore-cd 'true' setprop ttya-rts-dtr-off 'false' setprop ttya-ignore-cd 'true' setprop ttyb-mode '9600,8,n,1,-' setprop ttya-mode '9600,8,n,1,-' setprop lba-access-ok '1' setprop prealloc-chunk-size '0x2000' setprop bootpath '/pci@0,0/pci108e,5352@1f,2/disk@0,0:a' setprop keyboard-layout 'US-English' setprop console 'text' setprop altbootpath '/pci@0,0/pci108e,5352@1f,2/disk@1,0:a' cat /etc/vfstab #device device mount FS fsck mount mount #to mount to fsck point type pass at boot options # fd - /dev/fd fd - no - /proc - /proc proc - no - #/dev/dsk/c1t0d0s1 - - swap - no - /dev/md/dsk/d1 - - swap - no - /dev/md/dsk/d0 /dev/md/rdsk/d0 / ufs 1 no - /devices - /devices devfs - no - sharefs - /etc/dfs/sharetab sharefs - no - ctfs - /system/contract ctfs - no - objfs - /system/object objfs - no - swap - /tmp tmpfs - yes - df -h Filesystem size used avail capacity Mounted on /dev/md/dsk/d0 909G 11G 889G 2% / /devices 0K 0K 0K 0% /devices ctfs 0K 0K 0K 0% /system/contract proc 0K 0K 0K 0% /proc mnttab 0K 0K 0K 0% /etc/mnttab swap 14G 972K 14G 1% /etc/svc/volatile objfs 0K 0K 0K 0% /system/object sharefs 0K 0K 0K 0% /etc/dfs/sharetab /usr/lib/libc/libc_hwcap1.so.1 909G 11G 889G 2% /lib/libc.so.1 fd 0K 0K 0K 0% /dev/fd swap 14G 40K 14G 1% /tmp swap 14G 28K 14G 1% /var/run

Read the article
Ubuntu hard disk problem

- by Henadzy

Hello! I have got the error with a hard disk on Ubuntu 9.10. It slows down my system, applications have not been responding for a long time. But when I mount and use filesystem which placed on this hard disk at other computer it works properly. disk: SAMSUNG HD161HJ (SATA) syslog: Apr 25 00:28:25 vare6gin kernel: [ 885.773839] ata3.00: exception Emask 0x1 SAct 0x1e SErr 0x0 action 0x6 frozen Apr 25 00:28:25 vare6gin kernel: [ 885.773845] ata3.00: Ata error. fis:0x21 Apr 25 00:28:25 vare6gin kernel: [ 885.773861] ata3.00: cmd 60/08:08:3f:00:ad/00:00:10:00:00/40 tag 1 ncq 4096 in Apr 25 00:28:25 vare6gin kernel: [ 885.773864] res 51/40:24:67:c8:91/40:00:05:00:00/40 Emask 0x9 (media error) Apr 25 00:28:25 vare6gin kernel: [ 885.773871] ata3.00: status: { DRDY ERR } Apr 25 00:28:25 vare6gin kernel: [ 885.773877] ata3.00: error: { UNC } Apr 25 00:28:25 vare6gin kernel: [ 885.773890] ata3.00: cmd 60/18:10:9f:6b:ed/00:00:0e:00:00/40 tag 2 ncq 12288 in Apr 25 00:28:25 vare6gin kernel: [ 885.773893] res 51/40:24:67:c8:91/40:00:05:00:00/40 Emask 0x9 (media error) Apr 25 00:28:25 vare6gin kernel: [ 885.773900] ata3.00: status: { DRDY ERR } Apr 25 00:28:25 vare6gin kernel: [ 885.773904] ata3.00: error: { UNC } Apr 25 00:28:25 vare6gin kernel: [ 885.773918] ata3.00: cmd 60/08:18:3f:5f:ed/00:00:0e:00:00/40 tag 3 ncq 4096 in Apr 25 00:28:25 vare6gin kernel: [ 885.773921] res 51/40:24:67:c8:91/40:00:05:00:00/40 Emask 0x9 (media error) Apr 25 00:28:25 vare6gin kernel: [ 885.773927] ata3.00: status: { DRDY ERR } Apr 25 00:28:25 vare6gin kernel: [ 885.773932] ata3.00: error: { UNC } Apr 25 00:28:25 vare6gin kernel: [ 885.773946] ata3.00: cmd 60/08:20:67:c8:91/00:00:05:00:00/40 tag 4 ncq 4096 in Apr 25 00:28:25 vare6gin kernel: [ 885.773948] res 51/40:24:67:c8:91/40:00:05:00:00/40 Emask 0x9 (media error) Apr 25 00:28:25 vare6gin kernel: [ 885.773955] ata3.00: status: { DRDY ERR } Apr 25 00:28:25 vare6gin kernel: [ 885.773960] ata3.00: error: { UNC } Apr 25 00:28:25 vare6gin kernel: [ 885.773970] ata3: hard resetting link Apr 25 00:28:25 vare6gin kernel: [ 885.773974] ata3: nv: skipping hardreset on occupied port Apr 25 00:28:25 vare6gin kernel: [ 886.240073] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) Apr 25 00:28:25 vare6gin kernel: [ 886.256277] ata3.00: configured for UDMA/133 Apr 25 00:28:25 vare6gin kernel: [ 886.256305] ata3: EH complete Apr 25 00:28:27 vare6gin kernel: [ 888.176088] ata3: EH in SWNCQ mode,QC:qc_active 0xF sactive 0xF Apr 25 00:28:27 vare6gin kernel: [ 888.176099] ata3: SWNCQ:qc_active 0xF defer_bits 0x0 last_issue_tag 0x3 Apr 25 00:28:27 vare6gin kernel: [ 888.176102] dhfis 0xF dmafis 0x1 sdbfis 0x0 Apr 25 00:28:27 vare6gin kernel: [ 888.176109] ata3: ATA_REG 0x51 ERR_REG 0x40 Apr 25 00:28:27 vare6gin kernel: [ 888.176113] ata3: tag : dhfis dmafis sdbfis sacitve Apr 25 00:28:27 vare6gin kernel: [ 888.176120] ata3: tag 0x0: 1 1 0 1 Apr 25 00:28:27 vare6gin kernel: [ 888.176126] ata3: tag 0x1: 1 0 0 1 Apr 25 00:28:27 vare6gin kernel: [ 888.176131] ata3: tag 0x2: 1 0 0 1 Apr 25 00:28:27 vare6gin kernel: [ 888.176136] ata3: tag 0x3: 1 0 0 1

Read the article
Ubuntu hard disk problem

- by Henadzy

Hello! I have got the error with a hard disk on Ubuntu 9.10. It slows down my system, applications have not been responding for a long time. But when I mount and use filesystem which placed on this hard disk at other computer it works properly. disk: SAMSUNG HD161HJ (SATA) syslog: Apr 25 00:28:25 vare6gin kernel: [ 885.773839] ata3.00: exception Emask 0x1 SAct 0x1e SErr 0x0 action 0x6 frozen Apr 25 00:28:25 vare6gin kernel: [ 885.773845] ata3.00: Ata error. fis:0x21 Apr 25 00:28:25 vare6gin kernel: [ 885.773861] ata3.00: cmd 60/08:08:3f:00:ad/00:00:10:00:00/40 tag 1 ncq 4096 in Apr 25 00:28:25 vare6gin kernel: [ 885.773864] res 51/40:24:67:c8:91/40:00:05:00:00/40 Emask 0x9 (media error) Apr 25 00:28:25 vare6gin kernel: [ 885.773871] ata3.00: status: { DRDY ERR } Apr 25 00:28:25 vare6gin kernel: [ 885.773877] ata3.00: error: { UNC } Apr 25 00:28:25 vare6gin kernel: [ 885.773890] ata3.00: cmd 60/18:10:9f:6b:ed/00:00:0e:00:00/40 tag 2 ncq 12288 in Apr 25 00:28:25 vare6gin kernel: [ 885.773893] res 51/40:24:67:c8:91/40:00:05:00:00/40 Emask 0x9 (media error) Apr 25 00:28:25 vare6gin kernel: [ 885.773900] ata3.00: status: { DRDY ERR } Apr 25 00:28:25 vare6gin kernel: [ 885.773904] ata3.00: error: { UNC } Apr 25 00:28:25 vare6gin kernel: [ 885.773918] ata3.00: cmd 60/08:18:3f:5f:ed/00:00:0e:00:00/40 tag 3 ncq 4096 in Apr 25 00:28:25 vare6gin kernel: [ 885.773921] res 51/40:24:67:c8:91/40:00:05:00:00/40 Emask 0x9 (media error) Apr 25 00:28:25 vare6gin kernel: [ 885.773927] ata3.00: status: { DRDY ERR } Apr 25 00:28:25 vare6gin kernel: [ 885.773932] ata3.00: error: { UNC } Apr 25 00:28:25 vare6gin kernel: [ 885.773946] ata3.00: cmd 60/08:20:67:c8:91/00:00:05:00:00/40 tag 4 ncq 4096 in Apr 25 00:28:25 vare6gin kernel: [ 885.773948] res 51/40:24:67:c8:91/40:00:05:00:00/40 Emask 0x9 (media error) Apr 25 00:28:25 vare6gin kernel: [ 885.773955] ata3.00: status: { DRDY ERR } Apr 25 00:28:25 vare6gin kernel: [ 885.773960] ata3.00: error: { UNC } Apr 25 00:28:25 vare6gin kernel: [ 885.773970] ata3: hard resetting link Apr 25 00:28:25 vare6gin kernel: [ 885.773974] ata3: nv: skipping hardreset on occupied port Apr 25 00:28:25 vare6gin kernel: [ 886.240073] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) Apr 25 00:28:25 vare6gin kernel: [ 886.256277] ata3.00: configured for UDMA/133 Apr 25 00:28:25 vare6gin kernel: [ 886.256305] ata3: EH complete Apr 25 00:28:27 vare6gin kernel: [ 888.176088] ata3: EH in SWNCQ mode,QC:qc_active 0xF sactive 0xF Apr 25 00:28:27 vare6gin kernel: [ 888.176099] ata3: SWNCQ:qc_active 0xF defer_bits 0x0 last_issue_tag 0x3 Apr 25 00:28:27 vare6gin kernel: [ 888.176102] dhfis 0xF dmafis 0x1 sdbfis 0x0 Apr 25 00:28:27 vare6gin kernel: [ 888.176109] ata3: ATA_REG 0x51 ERR_REG 0x40 Apr 25 00:28:27 vare6gin kernel: [ 888.176113] ata3: tag : dhfis dmafis sdbfis sacitve Apr 25 00:28:27 vare6gin kernel: [ 888.176120] ata3: tag 0x0: 1 1 0 1 Apr 25 00:28:27 vare6gin kernel: [ 888.176126] ata3: tag 0x1: 1 0 0 1 Apr 25 00:28:27 vare6gin kernel: [ 888.176131] ata3: tag 0x2: 1 0 0 1 Apr 25 00:28:27 vare6gin kernel: [ 888.176136] ata3: tag 0x3: 1 0 0 1

Read the article
How to remove iso 9660 from USB?

- by a_m0d

I have somehow managed to write an iso 9660 image onto my USB drive, which makes all my computer think that the device is actually a CD. I have tried various methods of removing this partition, but nothing seems to work. I have tried fdisk, which says $ fdisk -l /dev/sdb Cannot open /dev/sdb parted crashes when I try to use it on this device. I have even tried $ dd if=/dev/zero of=/dev/sdb but it just hangs with no output (either on screen or on disk). However, when I plug the USB in, it does mount, and I can view (but not edit) the files on it. edit: now the result is $ dd if=/dev/zero of=/dev/sdb dd: opening `/dev/sdb': Read-only file system I have also tried re-formatting it on Windows, but it gets to the end of the format process and then says "Couldn't format the drive". How can I remove this partition and get my whole USB drive back to normal again? EDIT 1: Trying a simple mkfs doesn't work: $ sudo mkfs -t vfat /dev/sdb mkfs.vfat 3.0.0 (28 Sep 2008) mkfs.vfat: Will not try to make filesystem on full-disk device '/dev/sdb' (use -I if wanted) I can't do mkfs on /dev/sdb1 because there is no such partition, as shown:$ ls /dev | grep sdb sdb EDIT 2: This is the information posted by dmesg when I plug the device in:$ dmesg . . (snip) . usb 2-1: New USB device found, idVendor=058f, idProduct=6387 usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3 usb 2-1: Product: Mass Storage usb 2-1: Manufacturer: Generic usb 2-1: SerialNumber: G0905000000000010885 usb-storage: device found at 4 usb-storage: waiting for device to settle before scanning usb-storage: device scan complete scsi 6:0:0:0: Direct-Access FLASH Drive AU_USB20 8.07 PQ: 0 ANSI: 2 sd 6:0:0:0: [sdb] 4069376 512-byte hardware sectors (2084 MB) sd 6:0:0:0: [sdb] Write Protect is off sd 6:0:0:0: [sdb] Mode Sense: 03 00 00 00 sd 6:0:0:0: [sdb] Assuming drive cache: write through sd 6:0:0:0: [sdb] 4069376 512-byte hardware sectors (2084 MB) sd 6:0:0:0: [sdb] Write Protect is off sd 6:0:0:0: [sdb] Mode Sense: 03 00 00 00 sd 6:0:0:0: [sdb] Assuming drive cache: write through sdb: unknown partition table sd 6:0:0:0: [sdb] Attached SCSI removable disk sd 6:0:0:0: Attached scsi generic sg2 type 0 ISO 9660 Extensions: Microsoft Joliet Level 3 ISO 9660 Extensions: RRIP_1991A SELinux: initialized (dev sdb, type iso9660), uses genfs_contexts CE: hpet increasing min_delta_ns to 15000 nsec This shows that the device is formatted as ISO 9660 and that it is /dev/sdb. EDIT 3: This is the message that I find at the bottom of dmesg after running cfdisk and writing a new partition table to the disk:SELinux: initialized (dev sdb, type iso9660), uses genfs_contexts sd 17:0:0:0: [sdb] Device not ready: Sense Key : Not Ready [current] sd 17:0:0:0: [sdb] Device not ready: < ASC=0xff ASCQ=0xffASC=0xff < ASCQ=0xff end_request: I/O error, dev sdb, sector 0 Buffer I/O error on device sdb, logical block 0 lost page write due to I/O error on sdb

Read the article
Linux RAID-0 performance doesn't scale up over 1 GB/s

- by wazoox

I have trouble getting the max throughput out of my setup. The hardware is as follow : dual Quad-Core AMD Opteron(tm) Processor 2376 16 GB DDR2 ECC RAM dual Adaptec 52245 RAID controllers 48 1 TB SATA drives set up as 2 RAID-6 arrays (256KB stripe) + spares. Software : Plain vanilla 2.6.32.25 kernel, compiled for AMD-64, optimized for NUMA; Debian Lenny userland. benchmarks run : disktest, bonnie++, dd, etc. All give the same results. No discrepancy here. io scheduler used : noop. Yeah, no trick here. Up until now I basically assumed that striping (RAID 0) several physical devices should augment performance roughly linearly. However this is not the case here : each RAID array achieves about 780 MB/s write, sustained, and 1 GB/s read, sustained. writing to both RAID arrays simultaneously with two different processes gives 750 + 750 MB/s, and reading from both gives 1 + 1 GB/s. however when I stripe both arrays together, using either mdadm or lvm, the performance is about 850 MB/s writing and 1.4 GB/s reading. at least 30% less than expected! running two parallel writer or reader processes against the striped arrays doesn't enhance the figures, in fact it degrades performance even further. So what's happening here? Basically I ruled out bus or memory contention, because when I run dd on both drives simultaneously, aggregate write speed actually reach 1.5 GB/s and reading speed tops 2 GB/s. So it's not the PCIe bus. I suppose it's not the RAM. It's not the filesystem, because I get exactly the same numbers benchmarking against the raw device or using XFS. And I also get exactly the same performance using either LVM striping and md striping. What's wrong? What's preventing a process from going up to the max possible throughput? Is Linux striping defective? What other tests could I run?

Read the article
SAS Expanders vs Direct Attached (SAS)?

- by jemmille

I have a storage unit with 2 backplanes. One backplane holds 24 disks, one backplane holds 12 disks. Each backplane is independently connected to a SFF-8087 port (4 channel/12Gbit) to the raid card. Here is where my question really comes in. Can or how easily can a backplane be overloaded? All the disks in the machine are WD RE4 WD1003FBYX (black) drives that have average writes at 115MB/sec and average read of 125 MB/sec I know things would vary based on the raid or filesystem we put on top of that but it seems to be that a 24 disk backplane with only one SFF-8087 connector should be able to overload the bus to a point that might actually slow it down? Based on my math, if I had a RAID0 across all 24 disks and asked for a large file, I should, in theory should get 24*115 MB/sec wich translates to 22.08 GBit/sec of total throughput. Either I'm confused or this backplane is horribly designed, at least in a perfomance environment. I'm looking at switching to a model where each drive has it's own channel from the backplane (and new HBA's or raid card). EDIT: more details We have used both pure linux (centos), open solaris, software raid, hardware raid, EXT3/4, ZFS. Here are some examples using bonnie++ 4 Disk RAID-0, ZFS WRITE CPU RE-WRITE CPU READ CPU RND-SEEKS 194MB/s 19% 92MB/s 11% 200MB/s 8% 310/sec 194MB/s 19% 93MB/s 11% 201MB/s 8% 312/sec --------- ---- --------- ---- --------- ---- --------- 389MB/s 19% 186MB/s 11% 402MB/s 8% 311/sec 8 Disk RAID-0, ZFS WRITE CPU RE-WRITE CPU READ CPU RND-SEEKS 324MB/s 32% 164MB/s 19% 346MB/s 13% 466/sec 324MB/s 32% 164MB/s 19% 348MB/s 14% 465/sec --------- ---- --------- ---- --------- ---- --------- 648MB/s 32% 328MB/s 19% 694MB/s 13% 465/sec 12 Disk RAID-0, ZFS WRITE CPU RE-WRITE CPU READ CPU RND-SEEKS 377MB/s 38% 191MB/s 22% 429MB/s 17% 537/sec 376MB/s 38% 191MB/s 22% 427MB/s 17% 546/sec --------- ---- --------- ---- --------- ---- --------- 753MB/s 38% 382MB/s 22% 857MB/s 17% 541/sec Now 16 Disk RAID-0, it's gets interesting WRITE CPU RE-WRITE CPU READ CPU RND-SEEKS 359MB/s 34% 186MB/s 22% 407MB/s 18% 1397/sec 358MB/s 33% 186MB/s 22% 407MB/s 18% 1340/sec --------- ---- --------- ---- --------- ---- --------- 717MB/s 33% 373MB/s 22% 814MB/s 18% 1368/sec 20 Disk RAID-0, ZFS WRITE CPU RE-WRITE CPU READ CPU RND-SEEKS 371MB/s 37% 188MB/s 22% 450MB/s 19% 775/sec 370MB/s 37% 188MB/s 22% 447MB/s 19% 797/sec --------- ---- --------- ---- --------- ---- --------- 741MB/s 37% 376MB/s 22% 898MB/s 19% 786/sec 24 Disk RAID-1, ZFS WRITE CPU RE-WRITE CPU READ CPU RND-SEEKS 347MB/s 34% 193MB/s 22% 447MB/s 19% 907/sec 347MB/s 34% 192MB/s 23% 446MB/s 19% 933/sec --------- ---- --------- ---- --------- ---- --------- 694MB/s 34% 386MB/s 22% 894MB/s 19% 920/sec 28 Disk RAID-0, ZFS 32 Disk RAID-0, ZFS 36 Disk RAID-0, ZFS More details: Here is the exact unit: http://www.supermicro.com/products/chassis/4U/847/SC847E1-R1400U.cfm

Read the article
RAID 50 24Port Fast Writes Slow Reads - Ubuntu

- by James

What is going on here?! I am baffled. serveradmin@FILESERVER:/Volumes/MercuryInternal/test$ sudo dd if=/dev/zero of=/Volumes/MercuryInternal/test/test.fs bs=4096k count=10000 10000+0 records in 10000+0 records out 41943040000 bytes (42 GB) copied, 57.0948 s, 735 MB/s serveradmin@FILESERVER:/Volumes/MercuryInternal/test$ sudo dd if=/Volumes/MercuryInternal/test/test.fs of=/dev/null bs=4096k count=10000 10000+0 records in 10000+0 records out 41943040000 bytes (42 GB) copied, 116.189 s, 361 MB/s OF NOTE: My RAID50 is 3 sets of 8 disks. - This might not be the best config for SPEED. OS: Ubuntu 12.04.1 x64 Hardware Raid: RocketRaid 2782 - 24 Port Controller HardDriveType: Seagate Barracuda ES.2 1TB Drivers: v1.1 Open Source Linux Drivers. So 24 x 1TB drives, partitioned using parted. Filesystem is ext4. I/O scheduler WAS noop but have changed it to deadline with no seemingly performance benefit/cost. serveradmin@FILESERVER:/Volumes/MercuryInternal/test$ sudo gdisk -l /dev/sdb GPT fdisk (gdisk) version 0.8.1 Partition table scan: MBR: protective BSD: not present APM: not present GPT: present Found valid GPT with protective MBR; using GPT. Disk /dev/sdb: 41020686336 sectors, 19.1 TiB Logical sector size: 512 bytes Disk identifier (GUID): 95045EC6-6EAF-4072-9969-AC46A32E38C8 Partition table holds up to 128 entries First usable sector is 34, last usable sector is 41020686302 Partitions will be aligned on 2048-sector boundaries Total free space is 5062589 sectors (2.4 GiB) Number Start (sector) End (sector) Size Code Name 1 2048 41015625727 19.1 TiB 0700 primary To me this should be working fine. I can't think of anything that would be causing this other then fundamental driver errors? I can't seem to get much/if any higher then the 361MB a second, is this hitting the "SATA2" link speed, which it shouldn't given it is a PCIe2.0 card. Or maybe some cacheing quirk - I do have Write Back enabled. Does anyone have any suggestions? Tests for me to perform? Or if you require more information, I am happy to provide it! This is a video fileserver for editing machines, so we have a preference for FAST reads over writes. I was just expected more from RAID 50 and 24 drives together... EDIT: (hdparm results) serveradmin@FILESERVER:/Volumes/MercuryInternal$ sudo hdparm -Tt /dev/sdb /dev/sdb: Timing cached reads: 17458 MB in 2.00 seconds = 8735.50 MB/sec Timing buffered disk reads: 884 MB in 3.00 seconds = 294.32 MB/sec EDIT2: (config details) Also, I am using a RAID block size of 256K. I was told a larger block size is better for larger (in my case large video) files. EDIT3: (Bonnie++ Results. Would love some guidance with this!)

Read the article
How can I repair my USB drive?

- by yurko

USB drive is in read only state and I can't repair it. First of all I tried erase it using dd: root@yurko-laptop:/home/yurko-laptop# ls -l /dev/disk/by-id | grep usb lrwxrwxrwx 1 root root 9 ??? 18 23:45 usb-Generic_Flash_Disk_C173828A-0:0 -> ../../sdb lrwxrwxrwx 1 root root 10 ??? 18 23:45 usb-Generic_Flash_Disk_C173828A-0:0-part1 -> ../../sdb1 root@yurko-laptop:/home/yurko-laptop# dd if=/dev/zero of=/dev/sdb dd: ?????? ? «/dev/sdb»: ?? ?????????? ????????? ????? 8257537+0 ??????? ??????? 8257536+0 ??????? ???????? ??????????? 4227858432 ????? (4,2 GB), 942,633 c, 4,5 MB/c After that I wanted to create new filesystem using fdisk: root@yurko-laptop:/home/yurko-laptop# fdisk /dev/sdb You will not be able to write the partition table. WARNING: DOS-compatible mode is deprecated. It's strongly recommended to switch off the mode (command 'c') and change display units to sectors (command 'u'). Command (m for help): p Disk /dev/sdb: 4227 MB, 4227858432 bytes 4 heads, 63 sectors/track, 32768 cylinders Units = cylinders of 252 * 512 = 129024 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00000000 Device Boot Start End Blocks Id System /dev/sdb1 18 32768 4126596 b W95 FAT32 Command (m for help): fdisk showed that the partition still exists and I can't write the partition table. I tried to delete the existing partition: Command (m for help): d Selected partition 1 Command (m for help): w Unable to write /dev/sdb root@yurko-laptop:/home/yurko-laptop# Why am I not be able to write the partition table? Does it mean that some hardware failure occurred? And is it possible to repair the current USB drive? I've tried to use hdparm and it showed that the readonly flag is on: root@yurko-laptop:/home/yurko-laptop# hdparm /dev/sdb /dev/sdb: SG_IO: bad/missing sense data, sb[]: f0 00 05 00 00 00 00 0a 00 00 00 00 26 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 multcount = 0 (off) readonly = 1 (on) readahead = 256 (on) geometry = 1016/131/62, sectors = 8257536, start = 0

Read the article
Windows 7 The boot selection failed because a required device is inaccessible 0xc000000f

- by piratejackus

I have a problem with my Windows 7, hardware : Acer 3820TG Operating Systems : Windows 7 and Ubuntu 10.04 dual Case: When I try to boot my windows 7 I see an error: "Window failed to start. A recent hardware or software change might be the cause. To fix the problem: 1.Insert.... 2. .... ... status : 0xc000000f info : The boot selection failed because a required device is inaccessible .... " I can't exactly remember what were my last actions on Windows. I already searched this error and applied the proposed solutions, I created a repair USB (because I don't have a CD-ROM nor a Windows 7 CD) such as; -repair operating system :it says it cannot repair it -checking disk (chkdsk D: /f /r) : it checks the disk without a problem or error and it takes pretty long (more than a hour). But when I restart, still the same error. -I didn't create a restore point so I pass this option -I don't have a system image -I tried to run windows recovery (I have a recovery partition) but there are just two options: 1- Format the operating system but retain user data (copies the files under users to c\backup folder, but when I searched deeper I found that there are some people who already tried this option and couldn't find their user files under backup directory). Plus, I have unfortunately just one partition D (it is a fault I know) because I use always Ubuntu. So this is not applicable in my situation 2- Format entire system (Windows). I keep my valuable data in windows but not in user folder. I was reaching them from Windows. -I tried to repair windows boot by: bootrec /fixMBR bootrec /fixBoot bootrec /rebuildBCD I lost all grub menu, and reinstalled it. - ubuntuforums.org/showthread.php?t=1014708&page=29 nothing changed, same error. I created a thread in microsoft forums - http://social.answers.microsoft.com/Forums/en-US/w7install/thread/69517faf-850a-45fd- 8195-6d4ed831f805 but I couldn't find a solution. Before I run chkdsk from usb repair disk I couldn't able to mount Windows (NTFS) partition from Ubuntu, I was getting "couldn't mount file system, error code 2". I tried to fix ntfs partition from ubuntu and got "segmentation fault". I also created a thread on ubuntuforums for this mount problem: - http://ubuntuforums.org/showthread.php?t=1606427 So, after chkdsk, I could enable to mount windows partition but all I see in this partition is chkdsk logs, no any other data. Now, I don't think I lost my data because I don't get any filesystem errors, just the boot section, but this log files under windows partition makes me afraid. I see that Microsoft developers don't have a solution yet for this error. If you need any information to get more idea I can give, maybe I miss some points or it could be complicated. Thanks in advance.

Read the article
How can I force a merge of all WAL files in pg_xlog back into my base "data" directory?

- by Zac B

Question: Is there a way to tell Postgres (9.2) to "merge all WAL files in pg_xlog back into the non-WAL data files, and then delete all WAL files successfully merged?" I would like to be able to "force" this operation; i.e. checkpoint_segments or archiving settings should be ignored. The filesystem WAL buffer (pg_xlog) directory should be emptied, or nearly emptied. It's fine if some or all of the space consumed by the pg_xlog directory is then consumed by the data directory; our DBA has asked for WAL database backups without any backlogged WALs, but space consumption is not a concern. Having near-zero WAL activity during this operation is a fine constraint. I can ensure that the database server is either shut down or not connectible (zero user-generated transaction load) during this process. Essentially, I'd like Postgres to ignore archiving/checkpoint retention policies temporarily, and flush all WAL activity to the core database files, leaving pg_xlog in the same state as if the database were recently created--with very few WAL files. What I've Tried: I know that the pg_basebackup utility performs something like this (it generates an almost-all-WALs-merged copy of a Postgres instance's data directory), but we aren't ready to use it on all our systems yet, as we are still testing replication settings; I'm hoping for a more short-term solution. I've tried issuing CHECKPOINT commands, but they just recycle one WAL file and replace it with another (that is, if they do anything at all; if I issue them during database idle time, they do nothing). pg_switch_xlog() similarly just forces a switch to the next log segment; it doesn't flush all queued/buffered segments. I've also played with the pg_resetxlog utility. That utility sort of does what I want, but all of its usage docs seem to indicate that it destroys (rather than flushing out of the transaction log and into the main data files) some or all of the WAL data. Is that impression accurate? If not, can I use pg_resetxlog during a zero-WAL-activity period to force a flush of all queued WAL data to non-WAL data? If the answer to that is negative, how can I achieve this goal? Thanks!

Read the article
Converting an EC2 AMI to vmdk image

- by Reed G. Law

I've come quite close to getting Amazon Linux to boot inside VirtualBox, thanks to this answer and these websites. A quick overview of the steps I've taken: Launch EC2 instance with Amazon Linux 2011.09 64-bit AMI dd the contents of the EBS volume over ssh to a local image file. Mount the image file as a loopback device and then to a local mount point. Create a new empty disk image file, partition with an offset for a bootloader, and create an ext4 filesystem. Mount the new image's partition and copy everything from the EC2 image. Install grub (using Ubuntu's grub-legacy-ec2 package, not grub2). Convert the image file to vmdk using qemu-img. Create a new VirtualBox VM with the vmdk. Now the VM boots, grub loads, and the kernel is found. But it fails when it tries to mount the root device: dracut Warning: No root device "block:/dev/xvda1" found dracut Warning: Boot has failed. To debug this issue add "rdshell" to the kernel command line. dracut Warning: Signal caught! dracut Warning: Boot has failed. To debug this issue add "rdshell" to the kernel command line. Kernel panic - not syncing: Attempted to kill init! Pid: 1, comm: init Not tainted 2.6.35.14-107.1.39.amzn1.x86_64 #1 I have tried changing /boot/grub/menu.lst to find the root device by label and UUID, but nothing works. I'm guessing the xen kernel is not compatible with VirtualBox. The reasoning behind all this effort is to make a Vagrant box that is as close to possible as the production enviroment, so deploys can be tested locally. I know it's cheap to do test runs on EC2, but poor connectivity often ruins the experience. Plus it would be really nice to have a virtual machine with the production environment so that co-workers don't have to install everything under the sun just to get up and running with app development. If I were to try running a different kernel, what kernel could I get to be as close as possible to Amazon Linux 2011.09?

Read the article
Resizing Partitions on Live RHEL/cPanel Server

- by Timothy R. Butler

I've resized many partitions over the years on Linux, Windows and Mac OS X -- but always using a GUI. However, the time has come where the preset partition sizes my data center placed on my server aren't the right sizes and I need to resize a production server's disks. I could fiddle with it and probably do OK, but given that it is a production server, I wanted to get some advice about the right way to do this. I do have KVM over IP access, so if it is best to take the server offline and boot off a rescue partition, I can do that. root [/var/lib/mysql]# df -h Filesystem Size Used Avail Use% Mounted on /dev/sda2 9.9G 2.1G 7.3G 23% / tmpfs 7.8G 0 7.8G 0% /dev/shm /dev/sda1 99M 77M 18M 82% /boot /dev/sda8 884G 463G 376G 56% /home /dev/sda3 9.9G 8.0G 1.5G 85% /usr /dev/sda5 9.9G 9.1G 308M 97% /var /usr/tmpDSK 2.0G 38M 1.8G 3% /tmp As you can see /var and /usr are quite close to being full and I've actually had to symlink some logs on /usr to directories in /home to balance things out. What I would like to do is to add 6-10 GB each to /usr and /var, presumably taking the space from /home. As I think about how the disk is arranged, the best thought I've come up with is to reduce /home by 16 GB, say, and move /var to the spot freed up, then allocating /var's space to /usr. However, that would put /var at the far end of the disk, which seems less than idea, given that MySQL has all of its data on that partition. I'd love to take the space out of the closer end of /usr, but I assume that would take a very arduous (and perhaps risky) process of moving all of the data in /usr around. I seem to recall having such a process fail for me on a computer in the past. The other option might be to merge / and /usr since / is underutilized, though I'm not sure if that's a good idea. Do you have any suggestions both on the best reallocation plan and the commands to use to accomplish it? UPDATE: I should add -- here's the partition table. There's one unused partition, which, if memory serves, was the original tmp location before I created a tmp image: Name Flags Part Type FS Type [Label] Size (MB) ------------------------------------------------------------------------------ Unusable 1.05* sda1 Boot Primary Linux ext2 106.96* sda2 Primary Linux ext3 10737.42* sda3 Primary Linux ext3 10737.42* sda5 NC Logical Linux ext3 10738.47* sda6 NC Logical Linux swap / Solaris 2148.54* sda7 NC Logical Linux ext3 1074.80* sda8 NC Logical Linux ext3 964098.53*

Read the article
How to get more information from the system crash

- by viraptor

I'd like to debug an issue I'm having with a linux (debian stable) server, but I'm running out of ideas of how to confirm any diagnosis. Some background: The servers are running DL160 class with hardware raid between two disks. They're running a lot of services, mostly utilising network interface and CPU. There are 8 cpus and 7 "main" most cpu-hungry processes are bound to one core each via cpu affinity. Other random background scripts are not forced anywhere. The filesystem is writing ~1.5k blocks/s the whole time (goes up above 2k/s in peak times). Normal CPU usage for those servers is ~60% on 7 cores and some minimal usage on the last (whatever's running on shells usually). What actually happens is that the "main" services start using 100% CPU at some point, mainly stuck in kernel time. After a couple of seconds, LA goes over 400 and we lose any way to connect to the box (KVM is on it's way, but not there yet). Sometimes we see a kernel reporting hung task (but not always): [118951.272884] INFO: task zsh:15911 blocked for more than 120 seconds. [118951.272955] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [118951.273037] zsh D 0000000000000000 0 15911 1 [118951.273093] ffff8101898c3c48 0000000000000046 0000000000000000 ffffffffa0155e0a [118951.273183] ffff8101a753a080 ffff81021f1c5570 ffff8101a753a308 000000051f0fd740 [118951.273274] 0000000000000246 0000000000000000 00000000ffffffbd 0000000000000001 [118951.273335] Call Trace: [118951.273424] [<ffffffffa0155e0a>] :ext3:__ext3_journal_dirty_metadata+0x1e/0x46 [118951.273510] [<ffffffff804294f6>] schedule_timeout+0x1e/0xad [118951.273563] [<ffffffff8027577c>] __pagevec_free+0x21/0x2e [118951.273613] [<ffffffff80428b0b>] wait_for_common+0xcf/0x13a [118951.273692] [<ffffffff8022c168>] default_wake_function+0x0/0xe .... This would point at raid / disk failure, however sometimes the tasks are hung on kernel's gettsc which would indicate some general weird hardware behaviour. It's also running mysql (almost read-only, 99% cache hit), which seems to spawn a lot more threads during the system problems. During the day it does ~200kq/s (selects) and ~10q/s (writes). The host is never running out of memory or swapping, no oom reports are spotted. We've got many boxes with similar/same hardware and they all seem to behave that way, but I'm not sure which part fails, so it's probably not a good idea to just grab something more powerful and hope the problem goes away. Applications themselves don't really report anything wrong when they're running. I can run anything safely on the same hardware in an isolated environment. What can I do to narrow down the problem? Where else should I look for explanation?

Read the article
Unecrypted Image of Truecrypt-Encrypted System Partition

- by Dexter

The general tenor around the internet seems to be that you can't create images of system partitions that have been encrypted (with truecrypt) other than with dd or similar sector-by-sector copy tools. These files however are very impractical given their size (and are obviously incompressible) which makes keeping multiple states/backups of your system partition rather expensive (..especially considering current hdd prices). The problem is that backup tools (like Acronis True Image, Clonezilla, etc.) won't give you the option to create an image of (mounted/opened) Truecrypt partitions, or that there is no recovery environment for restoring the backup, that would allow to run truecrypt before doing any actual restoring. After some trial and error however, I believe I have found a very simple way. Since Truecrypt (running in Linux) creates a virtual block device, that it uses for mounting the unencrypted partitions into the file system, partclone can be used for creating/restoring images. What I did: boot up a linux live disk mount/open the drive/device/partition in truecrypt unmount the filesystem mount point again, like so: umount /media/truecryptX ("X" being the partition number assigend by truecrypt) use partclone (this is what clonezilla would do too, except that clonezilla only offers you to back up real drive partitions, not virtual block devices): partclone.ntfs -c -s /dev/mapper/truecryptX -o nameOfBackupFile for restoring steps 1-3 remain the same, and step 4 is partclone.ntfs -r -s nameOfBackupFile -o /dev/mapper/truecryptX A backup and test-restore of the system (with this method) seems to have worked fine (and the changed settings were reverted to the backup-state). The backup file is ~40 GB (and compressible down to <8GB with 7zip/LZMA2 on the "fast" setting). I can't quite believe that I'm the only one that wants to create images of encrypted drives, but doesn't want to waste 100GB on the backup of one single system state. So my question now is, given how simple this was, and that no one seems to mention anywhere that this is possible - did I miss something? or did I do something wrong? Is there any situation that I didn't think of where this method will fail? Obviously, the backup file needs to be stored in some other encrypted place in order to still remain confidential, since it is unencrypted. Also, in order to do a full "bare metal" restore, one would have to actually first (re-)install Windows, encrypt it, and only then restore the backup file. The funny thing however is that you won't need to backup any partition tables, etc. since the reinstall will effectively take care of that. Is there anything else? This is imho still a lot better than having sector-by-sector images..

Read the article
Cannot install grub to RAID1 (md0)

- by Andrew Answer

I have a RAID1 array on my Ubuntu 12.04 LTS and my /sda HDD has been replaced several days ago. I use this commands to replace: # go to superuser sudo bash # see RAID state mdadm -Q -D /dev/md0 # State should be "clean, degraded" # remove broken disk from RAID mdadm /dev/md0 --fail /dev/sda1 mdadm /dev/md0 --remove /dev/sda1 # see partitions fdisk -l # shutdown computer shutdown now # physically replace old disk by new # start system again # see partitions fdisk -l # copy partitions from sdb to sda sfdisk -d /dev/sdb | sfdisk /dev/sda # recreate id for sda sfdisk --change-id /dev/sda 1 fd # add sda1 to RAID mdadm /dev/md0 --add /dev/sda1 # see RAID state mdadm -Q -D /dev/md0 # State should be "clean, degraded, recovering" # to see status you can use cat /proc/mdstat This is the my mdadm output after sync: /dev/md0: Version : 0.90 Creation Time : Wed Feb 17 16:18:25 2010 Raid Level : raid1 Array Size : 470455360 (448.66 GiB 481.75 GB) Used Dev Size : 470455360 (448.66 GiB 481.75 GB) Raid Devices : 2 Total Devices : 2 Preferred Minor : 0 Persistence : Superblock is persistent Update Time : Thu Nov 1 15:19:31 2012 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 0 Spare Devices : 0 UUID : 92e6ff4e:ed3ab4bf:fee5eb6c:d9b9cb11 Events : 0.11049560 Number Major Minor RaidDevice State 0 8 1 0 active sync /dev/sda1 1 8 17 1 active sync /dev/sdb1 After bebuilding completion "fdisk -l" says what I have not valid partition table /dev/md0. This is my fdisk -l output: Disk /dev/sda: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00057d19 Device Boot Start End Blocks Id System /dev/sda1 * 63 940910984 470455461 fd Linux raid autodetect /dev/sda2 940910985 976768064 17928540 5 Extended /dev/sda5 940911048 976768064 17928508+ 82 Linux swap / Solaris Disk /dev/sdb: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x000667ca Device Boot Start End Blocks Id System /dev/sdb1 * 63 940910984 470455461 fd Linux raid autodetect /dev/sdb2 940910985 976768064 17928540 5 Extended /dev/sdb5 940911048 976768064 17928508+ 82 Linux swap / Solaris Disk /dev/md0: 481.7 GB, 481746288640 bytes 2 heads, 4 sectors/track, 117613840 cylinders, total 940910720 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00000000 Disk /dev/md0 doesn't contain a valid partition table This is my grub install output: root@answe:~# grub-install /dev/sda /usr/sbin/grub-setup: warn: Attempting to install GRUB to a disk with multiple partition labels or both partition label and filesystem. This is not supported yet.. /usr/sbin/grub-setup: error: embedding is not possible, but this is required for cross-disk install. root@answe:~# grub-install /dev/sdb Installation finished. No error reported. So 1) "update-grub" find only /sda and /sdb Linux, not /md0 2) "dpkg-reconfigure grub-pc" says "GRUB failed to install the following devices /dev/md0" I cannot load my system except from /sdb1 and /sda1, but in DEGRADED mode... Anybody can resolve this issue? I have big headache with this.

Read the article

Search Results

Search found 2834 results on 114 pages for 'filesystem corruption'.

Page 106/114 | < Previous Page | 102 103 104 105 106 107 108 109 110 111 112 113 | Next Page >

- by Ashok_Ora

- by Liu Maclean(???)

- by cswingle

- by dma_k

- by Richard Whitman

- by MetalSearGolid

- by racitup

- by dlo

- by Abel Coto

- by Xeoncross

- by Kani

- by Henadzy

- by Henadzy

- by a_m0d

- by wazoox

- by jemmille

- by James

- by yurko

- by piratejackus

- by Zac B

- by Reed G. Law

- by Timothy R. Butler

- by viraptor

- by Dexter

- by Andrew Answer

< Previous Page | 102 103 104 105 106 107 108 109 110 111 112 113 | Next Page >