bytes - Page 113 - Developer IT

Why is my /dev/random so slow when using dd?

- by Mikey

I am trying to semi-securely erase a bunch of hard drives. The following is working at 20-50Mb/s dd if=/dev/zero of=/dev/sda But dd if=/dev/random of=/dev/sda seems not to work. Also when I type dd if=/dev/random of=stdout It only gives me a few bytes regardless of what I pass it for bs= and count= Am I using /dev/random wrong? What other info should I look for to move this troubleshooting forward? Is there some other way to do this with a script or something like makeMyLifeEasy | dd if=stdin of=/dev/sda Or something like that...

Read the article

Shell script only executes partially when run with CRON

- by binaryorganic

I've written a shell script that does the following: Retrieve mail from a POP3 account (using GetMail) Save a copy of that email to S3 (using AWS CLI) Email me the filesize of the email The script runs fine manually, and technically runs from CRON, but it only seems to be sending the email. The getmail and S3 bits don't seem to run. Everything I've read seems to hammer home the message that I need to be careful about relative paths and the like when using CRON, but I think I'm using absolute paths everywhere I need to be, so I'm stumped as to what the issue could be. My Shell Script is here: #!/bin/bash # Run GetMail getmail -r /PATH/TO/EMAIL/getmail.email # Save to S3 aws s3 cp /PATH/TO/SCRIPT/email-backup.mbox s3://XXXXXXXXXX/email-backup.mbox # Send Confirmation Email SUBJECT="EMAIL SUBJECT" EMAIL="[email protected]" # Get current filesize FILENAME=/PATH/TO/SCRIPT/email-backup.mbox FILESIZE=$(stat -c%s "$FILENAME") # Email Content EMAILMESSAGE="/tmp/emailmessage.txt" echo "EMAIL BODY" >$EMAILMESSAGE echo "" >>$EMAILMESSAGE echo "Current File Size: $FILESIZE bytes" >>$EMAILMESSAGE # Send the Mail /bin/mail -s "$SUBJECT" "$EMAIL" < $EMAILMESSAGE

Read the article

Windows/Samba connection error

- by Gomibushi

I have a Linux fileserver serving up /home for linux and windows users. I was able to connect from my windows client, but not from a DC. Then suddenly I could connect from the DC too. The linux servers run Centrify clients, and as such are part of the domain. All on same subnet. This is what the the log.smbd says, repeatedly: [2010/02/11 11:25:57, 0] lib/util_sock.c:read_data(534) read_data: read failure for 4 bytes to client 192.168.200.3. Error = Connection reset by peer On Windows it appeared as an "unknown error". EDIT: the error code is "0x80004005". We are developing a system depended on the samba share, and are worried this will appear again. It would be nice to pin point the root of this. Any ideas what this might be? Places to look?

Read the article

Split a Large File In C++

- by wdow88

Hey all, I'm trying to write a program that takes a large file (of any time) and splits it into many smaller "chunks". I think I have the basic idea down, but for some reason I cannot create a chunk size over 12,000 bites. I know there are a few solutions on google, etc. but I am more interested in learning what the origin of this limitation is then actually using the program to split files. //This file splits are larger into smaller files of a user inputted size. #include<iostream> #include<fstream> #include<string> #include<sstream> #include <direct.h> #include <stdlib.h> using namespace std; void GetCurrentPath(char* buffer) { _getcwd(buffer, _MAX_PATH); } int main() { // use the function to get the path char CurrentPath[_MAX_PATH]; GetCurrentPath(CurrentPath);//Get the current directory (used for displaying output) fstream bigFile; string filename; int partsize; cout << "Enter a file name: "; cin >> filename; //Recieve target file cout << "Enter the number of bites in each smaller file: "; cin >> partsize; //Recieve volume size bigFile.open(filename.c_str(),ios::in | ios::binary); bigFile.seekg(0, ios::end); // position get-ptr 0 bytes from end int size = bigFile.tellg(); // get-ptr position is now same as file size bigFile.seekg(0, ios::beg); // position get-ptr 0 bytes from beginning for (int i = 0; i <= (size / partsize); i++) { //Build File Name string partname = filename; //The original filename string charnum; //archive number stringstream out; //stringstream object out, used to build the archive name out << "." << i; charnum = out.str(); partname.append(charnum); //put the part name together //Write new file part fstream filePart; filePart.open(partname.c_str(),ios::out | ios::binary); //Open new file with the name built above //Check if near the end of file if (bigFile.tellg() < (size - (size%partsize))) { filePart.write(reinterpret_cast<char *>(&bigFile),partsize); //Write the selected amount to the file filePart.close(); //close file bigFile.seekg(partsize, ios::cur); //move pointer to next position to be written } //Changes the size of the last volume because it is the end of the file else { filePart.write(reinterpret_cast<char *>(&bigFile),(size%partsize)); //Write the selected amount to the file filePart.close(); //close file } cout << "File " << CurrentPath << partname << " produced" << endl; //display the progress of the split } bigFile.close(); cout << "Split Complete." << endl; return 0; } Any ideas? Thanks!

Read the article

Random TCP Resets

- by allenwei

We got randomly TCP "reset" error when we send request to remote server. Log from remote server Cisco TCP Connection Terminated,Nov 05 14:43:39 EST: %ASA-session-6-302014: Teardown TCP connection 640068283 for Outside:xxxx to xxxx duration 0:00:00 bytes 4160 TCP Reset-O One my local machine I saw when I use netstat 100703 connections reset due to unexpected data 324186 connections reset due to early user close I also use tcpdump to see what's wrong with it, I saw xxxx.https: Flags [R.], seq 290, ack 1369, win 136, options [nop,nop,TS val 2871790533 ecr 1897173283], length 0 The problem just happened today, we didn't change anything on our server. Anyone know what's wrong with it? Is it related to code we wrote send out request or related to linux configuration?

Read the article

kernel 2.6.36 not booting

- by Saumitra

Hi, I m a newbie to kernel programming. I am trying to boot the kernel 2.6.36 on my ECG machine.It was working perfectly on 2.6.33.2. It is getting stuck on this step: ## Booting kernel from Legacy Image at 81000000 ... Image Name: Created: 2010-12-27 5:55:56 UTC Image Type: MIPS Linux Kernel Image (gzip compressed) Data Size: 1974278 Bytes = 1.9 MB Load Address: 80100000 Entry Point: 80104730 Verifying Checksum ... OK Uncompressing Kernel Image ... OK Starting kernel ... After this the system either resets or it hangs. I have also checked the configuration & set it properly.Please let me know.

Read the article

Database implementation question? [closed]

- by gundam

consider a disk with a sector size of 512 bytes, 2000 tracks/surface, 50 sectors/track, 5 doubled sided platters, average seek time is 10 msec. Assume a block size of 1024-byte is selected. Assume a file that contains 100,000 records of 100-byte each is to be stored on the disk, and NONE of the reocd can be spanned 2 blocks. How many blocks are needed to store the entire file?? If the file is arranged sequentially on disk, how many surfaces are required?? Now, i have calculated that 10,000 blocks are needed to store 100,000 records. But i am not sure how to find out the answer of the surfaces required. I only calculated the capacity of track is 25KB and capacity of surface is 50,000 KB But I don't know how to calculate the number of surfaces... Could anyone help me how to get the answer? Thanks a lot!!

Read the article

What command line tools for monitoring host network activity on linux do you use?

- by user27388

What command line tools are good for reliably monitoring network activity? I have used ifconfig, but an office colleague said that its statistics are not always reliable. Is that true? I have recently used ethtool, but is it reliable? What about just looking at /proc/net 'files'? Is that any better? EDIT I'm interested in packets Tx/Rx, bytes Tx/Rx, but most importantly drops or errors and why the drop/error might have occurred.

Read the article

Would SSD drives benefit from a non-default allocation unit size?

- by davebug

The default allocation unit size recommended when formatting a drive in our current set-up is 4096 bytes. I understand the basics of the pros and cons of larger and smaller sizes (performance boost vs. space preservation) but it seems the benefits of a solid state drive (seek times massively lower than hard disks) may create a situation where a much smaller allocation size is not detrimental. Were this the case it would at least partially help to overcome the disadvantage of SSD (massively higher prices per GB). Is there a way to determine the 'cost' of smaller allocation sizes specifically related to seek times? Or are there any studies or articles recommending a change from the default based on this newer tech? (Assume the most average scattering of sizes program files, OS files, data, mp3s, text files, etc.)

Read the article

Is there any script to do accounting for the proftpd's xferlog?

- by Aseques

I would like to convert from the xferlog format that proftpd uses into per user in/out bytes, to have a summary on how much traffic does each user use per month. The exact format is this: Thu Oct 17 12:47:05 2013 1 123.123.123.123 74852 /home/vftp/doc1.txt b _ i r user ftp 0 * c Thu Oct 17 12:47:06 2013 2 123.123.123.123 86321 /home/vftp/doc2.txt b _ i r user ftp 0 * c So far I only found a script that makes a nice report but not exactly what I needed, that one can be found here I might create a fork of this one and place it somewhere but it probably has been done a lot of times already. Just found a well hidden page in proftpd site with some more examples here

Read the article

Can't delete file from windows 7

- by r.s.mahanti

I downloaded a torrent file from internet. The problem is it's size is showing 0 bytes. I tried to scan it with antivirus, upload it to virus total, delete it but it's showing the file is not found. I tried to delete it in safe mode also but no success. Can anybody explain me, what can be reason for this and what is the way to delete this file? Thanks in advance. My operating system is windows 7. EDIT : the name of the file is "[Torrentreactor.to] - Site Translator 4.06.torrent."

Read the article

Override `drop` for a custom sequence

- by Bruno Reis

In short: in Clojure, is there a way to redefine a function from the standard sequence API (which is not defined on any interface like ISeq, IndexedSeq, etc) on a custom sequence type I wrote? 1. Huge data files I have big files in the following format: A long (8 bytes) containing the number n of entries n entries, each one being composed of 3 longs (ie, 24 bytes) 2. Custom sequence I want to have a sequence on these entries. Since I cannot usually hold all the data in memory at once, and I want fast sequential access on it, I wrote a class similar to the following: (deftype DataSeq [id ^long cnt ^long i cached-seq] clojure.lang.IndexedSeq (index [_] i) (count [_] (- cnt i)) (seq [this] this) (first [_] (first cached-seq)) (more [this] (if-let [s (next this)] s '())) (next [_] (if (not= (inc i) cnt) (if (next cached-seq) (DataSeq. id cnt (inc i) (next cached-seq)) (DataSeq. id cnt (inc i) (with-open [f (open-data-file id)] ; open a memory mapped byte array on the file ; seek to the exact position to begin reading ; decide on an optimal amount of data to read ; eagerly read and return that amount of data )))))) The main idea is to read ahead a bunch of entries in a list and then consume from that list. Whenever the cache is completely consumed, if there are remaining entries, they are read from the file in a new cache list. Simple as that. To create an instance of such a sequence, I use a very simple function like: (defn ^DataSeq load-data [id] (next (DataSeq. id (count-entries id) -1 []))) ; count-entries is a trivial "open file and read a long" memoized As you can see, the format of the data allowed me to implement count in very simply and efficiently. 3. drop could be O(1) In the same spirit, I'd like to reimplement drop. The format of these data files allows me to reimplement drop in O(1) (instead of the standard O(n)), as follows: if dropping less then the remaining cached items, just drop the same amount from the cache and done; if dropping more than cnt, then just return the empty list. otherwise, just figure out the position in the data file, jump right into that position, and read data from there. My difficulty is that drop is not implemented in the same way as count, first, seq, etc. The latter functions call a similarly named static method in RT which, in turn, calls my implementation above, while the former, drop, does not check if the instance of the sequence it is being called on provides a custom implementation. Obviously, I could provide a function named anything but drop that does exactly what I want, but that would force other people (including my future self) to remember to use it instead of drop every single time, which sucks. So, the question is: is it possible to override the default behaviour of drop? 4. A workaround (I dislike) While writing this question, I've just figured out a possible workaround: make the reading even lazier. The custom sequence would just keep an index and postpone the reading operation, that would happen only when first was called. The problem is that I'd need some mutable state: the first call to first would cause some data to be read into a cache, all the subsequent calls would return data from this cache. There would be a similar logic on next: if there's a cache, just next it; otherwise, don't bother populating it -- it will be done when first is called again. This would avoid unnecessary disk reads. However, this is still less than optimal -- it is still O(n), and it could easily be O(1). Anyways, I don't like this workaround, and my question is still open. Any thoughts? Thanks.

Read the article

Pinging 192.168.1.4 is looking to 192.168.1.2?

- by BVernon

When I run the command "ping 192.168.1.4" I am get the following results: Pinging 192.168.1.4 with 32 bytes of data: Reply from 192.168.1.2: Destination host unreachable. Reply from 192.168.1.2: Destination host unreachable. Reply from 192.168.1.2: Destination host unreachable. Reply from 192.168.1.2: Destination host unreachable. Can anyone help me understand why in the world it's telling me that 192.168.1.2 is unreachable when that's not even the ip address I typed in? I'm very confused. Also in case it's relevant, I'm on a workgroup.

Read the article

How can I monitor network usage by process on Mac OS X?

- by psmith

Is there any way to find out which process using how much internet bandwidth on Mac OS X Lion? I'm on mobile internet now, which is not very fast, so it would be nice if I can tell that for example, Chrome using 10kB/s, and Skype using 2kB/s. I can see the total amount of traffic in Activity Monitor, but it is not enough for me. I'd like to use an existing application, not interested to write an app like this. And I'm not interested in the actual traffic, only the number of bytes transferred and received by each processes.

Read the article

Total network data sent/received of a non-daemon Linux process?

- by leden

I'm looking for a simple and effective way of measuring total bytes received/sent from a single process upon its termination. Basically, I am looking for a tool which has the interface similar to "time" and "/usr/bin/time", e.g. measure-net-data <prog_to_run> <prog_args> Received (b): XYZ Sent (b): ABC I know that there are many tools for bandwidth/network monitoring, but as far I can tell all of them are performing the measurements it real-time, which is inappropriate not only because of overhead but also because of the inconvenience - I would need to stop the program, capture the output of the tool and then kill it. I have seen that newer versions of Linux 2.6.20+ provide /proc/<pid>/io/ which contain the information I'm looking for; however, everything under /proc/<pid> when the process terminates, so I'm again back to the same problem as with any network monitoring tool.

Read the article

Can't ping my computer - "Transmit failed. General failure."

- by Vaccano

I am having an issue with my computer. My IIS services are not working. I have narrowed it down to the fact that my computer cannot find itself via its name. I try pinging my computer by its name and I get this: C:\Users\18773ping MyComputerNameHere Pinging MyComputerNameHere [::1] with 32 bytes of data: PING: transmit failed. General failure. PING: transmit failed. General failure. PING: transmit failed. General failure. PING: transmit failed. General failure. Ping statistics for ::1: Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), I tried having someone else ping my machine and it works fine for them. Any ideas?

Read the article

Nested loop traversing arrays

- by alecco

There are 2 very big series of elements, the second 100 times bigger than the first. For each element of the first series, there are 0 or more elements on the second series. This can be traversed and processed with 2 nested loops. But the unpredictability of the amount of matching elements for each member of the first array makes things very, very slow. The actual processing of the 2nd series of elements involves logical and (&) and a population count. I couldn't find good optimizations using C but I am considering doing inline asm, doing rep* mov* or similar for each element of the first series and then doing the batch processing of the matching bytes of the second series, perhaps in buffers of 1MB or something. But the code would be get quite messy. Does anybody know of a better way? C preferred but x86 ASM OK too. Many thanks! Sample/demo code with simplified problem, first series are "people" and second series are "events", for clarity's sake. (the original problem is actually 100m and 10,000m entries!) #include <stdio.h> #include <stdint.h> #define PEOPLE 1000000 // 1m struct Person { uint8_t age; // Filtering condition uint8_t cnt; // Number of events for this person in E } P[PEOPLE]; // Each has 0 or more bytes with bit flags #define EVENTS 100000000 // 100m uint8_t P1[EVENTS]; // Property 1 flags uint8_t P2[EVENTS]; // Property 2 flags void init_arrays() { for (int i = 0; i < PEOPLE; i++) { // just some stuff P[i].age = i & 0x07; P[i].cnt = i % 220; // assert( sum < EVENTS ); } for (int i = 0; i < EVENTS; i++) { P1[i] = i % 7; // just some stuff P2[i] = i % 9; // just some other stuff } } int main(int argc, char *argv[]) { uint64_t sum = 0, fcur = 0; int age_filter = 7; // just some init_arrays(); // Init P, P1, P2 for (int64_t p = 0; p < PEOPLE ; p++) if (P[p].age < age_filter) for (int64_t e = 0; e < P[p].cnt ; e++, fcur++) sum += __builtin_popcount( P1[fcur] & P2[fcur] ); else fcur += P[p].cnt; // skip this person's events printf("(dummy %ld %ld)\n", sum, fcur ); return 0; } gcc -O5 -march=native -std=c99 test.c -o test

Read the article

Program using read() entering into an infinite loop

- by Soham

1oid ReadBinary(char *infile,HXmap* AssetMap) { int fd; size_t bytes_read, bytes_expected = 100000000*sizeof(char); char *data; if ((fd = open(infile,O_RDONLY)) < 0) err(EX_NOINPUT, "%s", infile); if ((data = malloc(bytes_expected)) == NULL) err(EX_OSERR, "data malloc"); bytes_read = read(fd, data, bytes_expected); if (bytes_read != bytes_expected) printf("Read only %d of %d bytes %d\n", \ bytes_read, bytes_expected,EX_DATAERR); /* ... operate on data ... */ printf("\n"); int i=0; int counter=0; char ch=data[0]; char message[512]; Message* newMessage; while(i!=bytes_read) { while(ch!='\n') { message[counter]=ch; i++; counter++; ch =data[i]; } message[counter]='\n'; message[counter+1]='\0'; //--------------------------------------------------- newMessage = (Message*)parser(message); MessageProcess(newMessage,AssetMap); //-------------------------------------------------- //printf("idNUM %e\n",newMessage->idNum); free(newMessage); i++; counter=0; ch =data[i]; } free(data); } Here, I have allocated 100MB of data with malloc, and passed a file big enough(not 500MB) size of 926KB about. When I pass small files, it reads and exits like a charm, but when I pass a big enough file, the program executes till some point after which it just hangs. I suspect it either entered an infinite loop, or there is memory leak. EDIT For better understanding I stripped away all unnecessary function calls, and checked what happens, when given a large file as input. I have attached the modified code void ReadBinary(char *infile,HXmap* AssetMap) { int fd; size_t bytes_read, bytes_expected = 500000000*sizeof(char); char *data; if ((fd = open(infile,O_RDONLY)) < 0) err(EX_NOINPUT, "%s", infile); if ((data = malloc(bytes_expected)) == NULL) err(EX_OSERR, "data malloc"); bytes_read = read(fd, data, bytes_expected); if (bytes_read != bytes_expected) printf("Read only %d of %d bytes %d\n", \ bytes_read, bytes_expected,EX_DATAERR); /* ... operate on data ... */ printf("\n"); int i=0; int counter=0; char ch=data[0]; char message[512]; while(i<=bytes_read) { while(ch!='\n') { message[counter]=ch; i++; counter++; ch =data[i]; } message[counter]='\n'; message[counter+1]='\0'; i++; printf("idNUM \n"); counter=0; ch =data[i]; } free(data); } What looks like is, it prints a whole lot of idNUM's and then poof segmentation fault I think this is an interesting behaviour, and to me it looks like there is some problem with memory FURTHER EDIT I changed back the i!=bytes_read it gives no segmentation fault. When I check for i<=bytes_read it blows past the limits in the innerloop.(courtesy gdb)

Read the article

Intel Corporation Ethernet Connection does not start properly

- by Oscar Alejos

I'm experiencing some problems when trying to connect my PC to the router through a switch. When the PC is directly connected to the router, everything works fine, Ubuntu (14.04) starts normally, and the Internet connection runs inmediately. The Ethernet controller is the Intel Corporation Ethernet Connection, as lspci returns: $ lspci | grep Eth 00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-V (rev 04) However, when I try to connect through the switch what I get is the following. dmesg returns: $ dmesg | grep eth [ 1.035585] e1000e 0000:00:19.0 eth0: registered PHC clock [ 1.035587] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:22:4d:a7:be:5d [ 1.035589] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection [ 1.035625] e1000e 0000:00:19.0 eth0: MAC: 11, PHY: 12, PBA No: FFFFFF-0FF [ 1.357838] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready [ 2.165413] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready [ 2.165574] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready [ 2.641287] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready [ 16.715086] e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx [ 16.715090] e1000e 0000:00:19.0 eth0: 10/100 speed: disabling TSO [ 16.715117] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready It looks like eth0 is properly working. Actually, nm-tool returns: $ nm-tool - Device: eth0 [Conexión cableada] ------------------------------------------- Type: Wired Driver: e1000e State: connected Default: yes HW Address: 00:22:4D:A7:BE:5D Capabilities: Carrier Detect: yes Speed: 100 Mb/s Wired Properties Carrier: on IPv4 Settings: Address: 192.168.1.30 Prefix: 24 (255.255.255.0) Gateway: 192.168.1.1 DNS: 80.58.61.250 DNS: 80.58.61.254 DNS: 192.168.1.1 However, ping returns: $ ping 192.168.1.1 PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. From 192.168.1.30 icmp_seq=1 Destination Host Unreachable From 192.168.1.30 icmp_seq=2 Destination Host Unreachable From 192.168.1.30 icmp_seq=3 Destination Host Unreachable The connection is restored by restarting it: # ifconfig eth0 down # ifconfig eth0 up From this point on, everything runs smoothly, as if the PC were directly connected to the router. It seems to be an issue related to the integrated LAN adaptor and the Ethernet controller, since my laptop connects without any problem. My desktop board is an Intel DB85FL. I'd be grateful if anyone could give some ideas on how to solve this issue. Thank you in advance.

Read the article

Obama’s win and the geeky stuff behind his success

- by Gopinath

Mr. Obama is elected as President of United States for the second term with a great majority. Here are few geeky bytes associated with Mr.Obama’s success and the world records he set up Obama Creates New Records on Twitter and Facebook – Just after winning the race for president, Barak Obama setup world records on Twitter & Facebook. His tweet “Four more years” re-tweeted more than 770K times and his Facebook post received 3.8 million likes. Twitter Kills the Fail Whale, One Tweet at a Time – Twitter handled insane user loads with millions of tweets related to election and proved that it’s platform is now more robust than ever. In a post on the company’s engineering blog, Twitter said people sent 31 million election-related tweets on Tuesday alone. From 8:11 p.m. to 9:11 p.m. P.S.T., Twitter processed an average 9,965 tweets per second, with a one-second peak of 15,107 tweets per second at 8:20 p.m., the company said. Obama’s win a big vindication for Nate Silver, king of the quants – Nate Silver, a statistician and blogger was spot on in predicting Obama’s win many weeks. He did not depend on astrology or surveys to predict Obama’s success. He used big data and statistical analysis to project votes. CNET says “Despite some incredulous political pundits, the FiveThirtyEight statistician appears to have correctly predicted the winner in all 50 states in the presidential election” Inside the Secret World of the Data Crunchers Who Helped Obama Win – Team Obama used big data and analytical systems to win the elections!! Barack Obama Goes On Reddit For Last Campaign Stop Of Political Career – Reditt is getting a lot of Obama’s attention these days. He is quite often stopping by Reditt to say hi to the nerds & geeks hanging out there. His chat with nerds on Reddit has paid rich dividends.

Read the article

Why are data structures so important in interviews?

- by Vamsi Emani

I am a newbie into the corporate world recently graduated in computers. I am a java/groovy developer. I am a quick learner and I can learn new frameworks, APIs or even programming languages within considerably short amount of time. Albeit that, I must confess that I was not so strong in data structures when I graduated out of college. Through out the campus placements during my graduation, I've witnessed that most of the biggie tech companies like Amazon, Microsoft etc focused mainly on data structures. It appears as if data structures is the only thing that they expect from a graduate. Adding to this, I see that there is this general perspective that a good programmer is necessarily a one with good knowledge about data structures. To be honest, I felt bad about that. I write good code. I follow standard design patterns of coding, I do use data structures but at the superficial level as in java exposed APIs like ArrayLists, LinkedLists etc. But the companies usually focused on the intricate aspects of Data Structures like pointer based memory manipulation and time complexities. Probably because of my java-ish background, Back then, I understood code efficiency and logic only when talked in terms of Object Oriented Programming like Objects, instances, etc but I never drilled down into the level of bits and bytes. I did not want people to look down upon me for this knowledge deficit of mine in Data Structures. So really why all this emphasis on Data Structures? Does, Not having knowledge in Data Structures really effect one's career in programming? Or is the knowledge in this subject really a sufficient basis to differentiate a good and a bad programmer?

Read the article

Running a simple integration scenario using the Oracle Big Data Connectors on Hadoop/HDFS cluster

- by hamsun

Between the elephant ( the tradional image of the Hadoop framework) and the Oracle Iron Man (Big Data..) an english setter could be seen as the link to the right data Data, Data, Data, we are living in a world where data technology based on popular applications , search engines, Webservers, rich sms messages, email clients, weather forecasts and so on, have a predominant role in our life. More and more technologies are used to analyze/track our behavior, try to detect patterns, to propose us "the best/right user experience" from the Google Ad services, to Telco companies or large consumer sites (like Amazon:) ). The more we use all these technologies, the more we generate data, and thus there is a need of huge data marts and specific hardware/software servers (as the Exadata servers) in order to treat/analyze/understand the trends and offer new services to the users. Some of these "data feeds" are raw, unstructured data, and cannot be processed effectively by normal SQL queries. Large scale distributed processing was an emerging infrastructure need and the solution seemed to be the "collocation of compute nodes with the data", which in turn leaded to MapReduce parallel patterns and the development of the Hadoop framework, which is based on MapReduce and a distributed file system (HDFS) that runs on larger clusters of rather inexpensive servers. Several Oracle products are using the distributed / aggregation pattern for data calculation ( Coherence, NoSql, times ten ) so once that you are familiar with one of these technologies, lets says with coherence aggregators, you will find the whole Hadoop, MapReduce concept very similar. Oracle Big Data Appliance is based on the Cloudera Distribution (CDH), and the Oracle Big Data Connectors can be plugged on a Hadoop cluster running the CDH distribution or equivalent Hadoop clusters. In this paper, a "lab like" implementation of this concept is done on a single Linux X64 server, running an Oracle Database 11g Enterprise Edition Release 11.2.0.4.0, and a single node Apache hadoop-1.2.1 HDFS cluster, using the SQL connector for HDFS. The whole setup is fairly simple: Install on a Linux x64 server ( or virtual box appliance) an Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 server Get the Apache Hadoop distribution from: http://mir2.ovh.net/ftp.apache.org/dist/hadoop/common/hadoop-1.2.1. Get the Oracle Big Data Connectors from: http://www.oracle.com/technetwork/bdc/big-data-connectors/downloads/index.html?ssSourceSiteId=ocomen. Check the java version of your Linux server with the command: java -version java version "1.7.0_40" Java(TM) SE Runtime Environment (build 1.7.0_40-b43) Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode) Decompress the hadoop hadoop-1.2.1.tar.gz file to /u01/hadoop-1.2.1 Modify your .bash_profile export HADOOP_HOME=/u01/hadoop-1.2.1 export PATH=$PATH:$HADOOP_HOME/bin export HIVE_HOME=/u01/hive-0.11.0 export PATH=$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin (also see my sample .bash_profile) Set up ssh trust for Hadoop process, this is a mandatory step, in our case we have to establish a "local trust" as will are using a single node configuration copy the new public keys to the list of authorized keys connect and test the ssh setup to your localhost: We will run a "pseudo-Hadoop cluster", in what is called "local standalone mode", all the Hadoop java components are running in one Java process, this is enough for our demo purposes. We need to "fine tune" some Hadoop configuration files, we have to go at our $HADOOP_HOME/conf, and modify the files: core-site.xml hdfs-site.xml mapred-site.xml check that the hadoop binaries are referenced correctly from the command line by executing: hadoop -version As Hadoop is managing our "clustered HDFS" file system we have to create "the mount point" and format it , the mount point will be declared to core-site.xml as: The layout under the /u01/hadoop-1.2.1/data will be created and used by other hadoop components (MapReduce = /mapred/...) HDFS is using the /dfs/... layout structure format the HDFS hadoop file system: Start the java components for the HDFS system As an additional check, you can use the GUI Hadoop browsers to check the content of your HDFS configurations: Once our HDFS Hadoop setup is done you can use the HDFS file system to store data ( big data : )), and plug them back and forth to Oracle Databases by the means of the Big Data Connectors ( which is the next configuration step). You can create / use a Hive db, but in our case we will make a simple integration of "raw data" , through the creation of an External Table to a local Oracle instance ( on the same Linux box, we run the Hadoop HDFS one node cluster and one Oracle DB). Download some public "big data", I use the site: http://france.meteofrance.com/france/observations, from where I can get *.csv files for my big data simulations :). Here is the data layout of my example file: Download the Big Data Connector from the OTN (oraosch-2.2.0.zip), unzip it to your local file system (see picture below) Modify your environment in order to access the connector libraries , and make the following test: [oracle@dg1 bin]$./hdfs_stream Usage: hdfs_stream locationFile [oracle@dg1 bin]$ Load the data to the Hadoop hdfs file system: hadoop fs -mkdir bgtest_data hadoop fs -put obsFrance.txt bgtest_data/obsFrance.txt hadoop fs -ls /user/oracle/bgtest_data/obsFrance.txt [oracle@dg1 bg-data-raw]$ hadoop fs -ls /user/oracle/bgtest_data/obsFrance.txt Found 1 items -rw-r--r-- 1 oracle supergroup 54103 2013-10-22 06:10 /user/oracle/bgtest_data/obsFrance.txt [oracle@dg1 bg-data-raw]$hadoop fs -ls hdfs:///user/oracle/bgtest_data/obsFrance.txt Found 1 items -rw-r--r-- 1 oracle supergroup 54103 2013-10-22 06:10 /user/oracle/bgtest_data/obsFrance.txt Check the content of the HDFS with the browser UI: Start the Oracle database, and run the following script in order to create the Oracle database user, the Oracle directories for the Oracle Big Data Connector (dg1 it’s my own db id replace accordingly yours): #!/bin/bash export ORAENV_ASK=NO export ORACLE_SID=dg1 . oraenv sqlplus /nolog <<EOF CONNECT / AS sysdba; CREATE OR REPLACE DIRECTORY osch_bin_path AS '/u01/orahdfs-2.2.0/bin'; CREATE USER BGUSER IDENTIFIED BY oracle; GRANT CREATE SESSION, CREATE TABLE TO BGUSER; GRANT EXECUTE ON sys.utl_file TO BGUSER; GRANT READ, EXECUTE ON DIRECTORY osch_bin_path TO BGUSER; CREATE OR REPLACE DIRECTORY BGT_LOG_DIR as '/u01/BG_TEST/logs'; GRANT READ, WRITE ON DIRECTORY BGT_LOG_DIR to BGUSER; CREATE OR REPLACE DIRECTORY BGT_DATA_DIR as '/u01/BG_TEST/data'; GRANT READ, WRITE ON DIRECTORY BGT_DATA_DIR to BGUSER; EOF Put the following in a file named t3.sh and make it executable, hadoop jar $OSCH_HOME/jlib/orahdfs.jar \ oracle.hadoop.exttab.ExternalTable \ -D oracle.hadoop.exttab.tableName=BGTEST_DP_XTAB \ -D oracle.hadoop.exttab.defaultDirectory=BGT_DATA_DIR \ -D oracle.hadoop.exttab.dataPaths="hdfs:///user/oracle/bgtest_data/obsFrance.txt" \ -D oracle.hadoop.exttab.columnCount=7 \ -D oracle.hadoop.connection.url=jdbc:oracle:thin:@//localhost:1521/dg1 \ -D oracle.hadoop.connection.user=BGUSER \ -D oracle.hadoop.exttab.printStackTrace=true \ -createTable --noexecute then test the creation fo the external table with it: [oracle@dg1 samples]$ ./t3.sh ./t3.sh: line 2: /u01/orahdfs-2.2.0: Is a directory Oracle SQL Connector for HDFS Release 2.2.0 - Production Copyright (c) 2011, 2013, Oracle and/or its affiliates. All rights reserved. Enter Database Password:] The create table command was not executed. The following table would be created. CREATE TABLE "BGUSER"."BGTEST_DP_XTAB" ( "C1" VARCHAR2(4000), "C2" VARCHAR2(4000), "C3" VARCHAR2(4000), "C4" VARCHAR2(4000), "C5" VARCHAR2(4000), "C6" VARCHAR2(4000), "C7" VARCHAR2(4000) ) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY "BGT_DATA_DIR" ACCESS PARAMETERS ( RECORDS DELIMITED BY 0X'0A' CHARACTERSET AL32UTF8 STRING SIZES ARE IN CHARACTERS PREPROCESSOR "OSCH_BIN_PATH":'hdfs_stream' FIELDS TERMINATED BY 0X'2C' MISSING FIELD VALUES ARE NULL ( "C1" CHAR(4000), "C2" CHAR(4000), "C3" CHAR(4000), "C4" CHAR(4000), "C5" CHAR(4000), "C6" CHAR(4000), "C7" CHAR(4000) ) ) LOCATION ( 'osch-20131022081035-74-1' ) ) PARALLEL REJECT LIMIT UNLIMITED; The following location files would be created. osch-20131022081035-74-1 contains 1 URI, 54103 bytes 54103 hdfs://localhost:19000/user/oracle/bgtest_data/obsFrance.txt Then remove the --noexecute flag and create the external Oracle table for the Hadoop data. Check the results: The create table command succeeded. CREATE TABLE "BGUSER"."BGTEST_DP_XTAB" ( "C1" VARCHAR2(4000), "C2" VARCHAR2(4000), "C3" VARCHAR2(4000), "C4" VARCHAR2(4000), "C5" VARCHAR2(4000), "C6" VARCHAR2(4000), "C7" VARCHAR2(4000) ) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY "BGT_DATA_DIR" ACCESS PARAMETERS ( RECORDS DELIMITED BY 0X'0A' CHARACTERSET AL32UTF8 STRING SIZES ARE IN CHARACTERS PREPROCESSOR "OSCH_BIN_PATH":'hdfs_stream' FIELDS TERMINATED BY 0X'2C' MISSING FIELD VALUES ARE NULL ( "C1" CHAR(4000), "C2" CHAR(4000), "C3" CHAR(4000), "C4" CHAR(4000), "C5" CHAR(4000), "C6" CHAR(4000), "C7" CHAR(4000) ) ) LOCATION ( 'osch-20131022081719-3239-1' ) ) PARALLEL REJECT LIMIT UNLIMITED; The following location files were created. osch-20131022081719-3239-1 contains 1 URI, 54103 bytes 54103 hdfs://localhost:19000/user/oracle/bgtest_data/obsFrance.txt This is the view from the SQL Developer: and finally the number of lines in the oracle table, imported from our Hadoop HDFS cluster SQL select count(*) from "BGUSER"."BGTEST_DP_XTAB"; COUNT(*) ---------- 1151 In a next post we will integrate data from a Hive database, and try some ODI integrations with the ODI Big Data connector. Our simplistic approach is just a step to show you how these unstructured data world can be integrated to Oracle infrastructure. Hadoop, BigData, NoSql are great technologies, they are widely used and Oracle is offering a large integration infrastructure based on these services. Oracle University presents a complete curriculum on all the Oracle related technologies: NoSQL: Introduction to Oracle NoSQL Database Using Oracle NoSQL Database Big Data: Introduction to Big Data Oracle Big Data Essentials Oracle Big Data Overview Oracle Data Integrator: Oracle Data Integrator 12c: New Features Oracle Data Integrator 11g: Integration and Administration Oracle Data Integrator: Administration and Development Oracle Data Integrator 11g: Advanced Integration and Development Oracle Coherence 12c: Oracle Coherence 12c: New Features Oracle Coherence 12c: Share and Manage Data in Clusters Oracle Coherence 12c: Oracle GoldenGate 11g: Fundamentals for Oracle Oracle GoldenGate 11g: Fundamentals for SQL Server Oracle GoldenGate 11g Fundamentals for Oracle Oracle GoldenGate 11g Fundamentals for DB2 Oracle GoldenGate 11g Fundamentals for Teradata Oracle GoldenGate 11g Fundamentals for HP NonStop Oracle GoldenGate 11g Management Pack: Overview Oracle GoldenGate 11g Troubleshooting and Tuning Oracle GoldenGate 11g: Advanced Configuration for Oracle Other Resources: Apache Hadoop : http://hadoop.apache.org/ is the homepage for these technologies. "Hadoop Definitive Guide 3rdEdition" by Tom White is a classical lecture for people who want to know more about Hadoop , and some active "googling " will also give you some more references. About the author: Eugene Simos is based in France and joined Oracle through the BEA-Weblogic Acquisition, where he worked for the Professional Service, Support, end Education for major accounts across the EMEA Region. He worked in the banking sector, ATT, Telco companies giving him extensive experience on production environments. Eugen currently specializes in Oracle Fusion Middleware teaching an array of courses on Weblogic/Webcenter, Content,BPM /SOA/Identity-Security/GoldenGate/Virtualisation/Unified Comm Suite) throughout the EMEA region.

Read the article

Microsoft Test Manager error in displaying test steps caused by malware

- by terje

Sometimes the tool is blamed for errors which are not the fault of the tool – this is one such story. It was however, not so easy to get to the bottom of it, so I hope sharing this story can help some others. One of our test developers started to get this message inside the test steps part of a test case in the MTM. saying “Could not load file or assembly ‘0 bytes from System, Version=4.0.0.0,……..” The same error came up inside Visual Studio when we opened a test case there. Then we noted a similar error on another piece of software – this error: A System.BadImageFormatException, and same message as above, but just for framework 2.0. We found this description which pointed to a malware problem (See bottom of that post), that is a fake anti-spyware program called “Additional Guard”. We checked the computer in question using Malwarebytes Anti-Malware tool. It found and cleaned out 753 registry keys!! After this cleanup operation the error was gone. This is a great tool ! The “Additional Guard” program had been inadvertently installed, and then uninstalled afterwards, but the corrupted keys were of course not removed. We also noted that this computer had full corporate virus scanning and malware protection, but still this nasty little thing still slipped through. Technorati Tags: Malware,BadImageFormatException,Microsoft Test Manager,Malwarebytes

Read the article

The fastest way to resize images from ASP.NET. And it’s (more) supported-ish.

- by Bertrand Le Roy

I’ve shown before how to resize images using GDI, which is fairly common but is explicitly unsupported because we know of very real problems that this can cause. Still, many sites still use that method because those problems are fairly rare, and because most people assume it’s the only way to get the job done. Plus, it works in medium trust. More recently, I’ve shown how you can use WPF APIs to do the same thing and get JPEG thumbnails, only 2.5 times faster than GDI (even now that GDI really ultimately uses WIC to read and write images). The boost in performance is great, but it comes at a cost, that you may or may not care about: it won’t work in medium trust. It’s also just as unsupported as the GDI option. What I want to show today is how to use the Windows Imaging Components from ASP.NET APIs directly, without going through WPF. The approach has the great advantage that it’s been tested and proven to scale very well. The WIC team tells me you should be able to call support and get answers if you hit problems. Caveats exist though. First, this is using interop, so until a signed wrapper sits in the GAC, it will require full trust. Second, the APIs have a very strong smell of native code and are definitely not .NET-friendly. And finally, the most serious problem is that older versions of Windows don’t offer MTA support for image decoding. MTA support is only available on Windows 7, Vista and Windows Server 2008. But on 2003 and XP, you’ll only get STA support. that means that the thread safety that we so badly need for server applications is not guaranteed on those operating systems. To make it work, you’d have to spin specialized threads yourself and manage the lifetime of your objects, which is outside the scope of this article. We’ll assume that we’re fine with al this and that we’re running on 7 or 2008 under full trust. Be warned that the code that follows is not simple or very readable. This is definitely not the easiest way to resize an image in .NET. Wrapping native APIs such as WIC in a managed wrapper is never easy, but fortunately we won’t have to: the WIC team already did it for us and released the results under MS-PL. The InteropServices folder, which contains the wrappers we need, is in the WicCop project but I’ve also included it in the sample that you can download from the link at the end of the article. In order to produce a thumbnail, we first have to obtain a decoding frame object that WIC can use. Like with WPF, that object will contain the command to decode a frame from the source image but won’t do the actual decoding until necessary. Getting the frame is done by reading the image bytes through a special WIC stream that you can obtain from a factory object that we’re going to reuse for lots of other tasks: var photo = File.ReadAllBytes(photoPath); var factory = (IWICComponentFactory)new WICImagingFactory(); var inputStream = factory.CreateStream(); inputStream.InitializeFromMemory(photo, (uint)photo.Length); var decoder = factory.CreateDecoderFromStream( inputStream, null, WICDecodeOptions.WICDecodeMetadataCacheOnLoad); var frame = decoder.GetFrame(0); We can read the dimensions of the frame using the following (somewhat ugly) code: uint width, height; frame.GetSize(out width, out height); This enables us to compute the dimensions of the thumbnail, as I’ve shown in previous articles. We now need to prepare the output stream for the thumbnail. WIC requires a special kind of stream, IStream (not implemented by System.IO.Stream) and doesn’t directlyunderstand .NET streams. It does provide a number of implementations but not exactly what we need here. We need to output to memory because we’ll want to persist the same bytes to the response stream and to a local file for caching. The memory-bound version of IStream requires a fixed-length buffer but we won’t know the length of the buffer before we resize. To solve that problem, I’ve built a derived class from MemoryStream that also implements IStream. The implementation is not very complicated, it just delegates the IStream methods to the base class, but it involves some native pointer manipulation. Once we have a stream, we need to build the encoder for the output format, which could be anything that WIC supports. For web thumbnails, our only reasonable options are PNG and JPEG. I explored PNG because it’s a lossless format, and because WIC does support PNG compression. That compression is not very efficient though and JPEG offers good quality with much smaller file sizes. On the web, it matters. I found the best PNG compression option (adaptive) to give files that are about twice as big as 100%-quality JPEG (an absurd setting), 4.5 times bigger than 95%-quality JPEG and 7 times larger than 85%-quality JPEG, which is more than acceptable quality. As a consequence, we’ll use JPEG. The JPEG encoder can be prepared as follows: var encoder = factory.CreateEncoder( Consts.GUID_ContainerFormatJpeg, null); encoder.Initialize(outputStream, WICBitmapEncoderCacheOption.WICBitmapEncoderNoCache); The next operation is to create the output frame: IWICBitmapFrameEncode outputFrame; var arg = new IPropertyBag2[1]; encoder.CreateNewFrame(out outputFrame, arg); Notice that we are passing in a property bag. This is where we’re going to specify our only parameter for encoding, the JPEG quality setting: var propBag = arg[0]; var propertyBagOption = new PROPBAG2[1]; propertyBagOption[0].pstrName = "ImageQuality"; propBag.Write(1, propertyBagOption, new object[] { 0.85F }); outputFrame.Initialize(propBag); We can then set the resolution for the thumbnail to be 96, something we weren’t able to do with WPF and had to hack around: outputFrame.SetResolution(96, 96); Next, we set the size of the output frame and create a scaler from the input frame and the computed dimensions of the target thumbnail: outputFrame.SetSize(thumbWidth, thumbHeight); var scaler = factory.CreateBitmapScaler(); scaler.Initialize(frame, thumbWidth, thumbHeight, WICBitmapInterpolationMode.WICBitmapInterpolationModeFant); The scaler is using the Fant method, which I think is the best looking one even if it seems a little softer than cubic (zoomed here to better show the defects): Cubic Fant Linear Nearest neighbor We can write the source image to the output frame through the scaler: outputFrame.WriteSource(scaler, new WICRect { X = 0, Y = 0, Width = (int)thumbWidth, Height = (int)thumbHeight }); And finally we commit the pipeline that we built and get the byte array for the thumbnail out of our memory stream: outputFrame.Commit(); encoder.Commit(); var outputArray = outputStream.ToArray(); outputStream.Close(); That byte array can then be sent to the output stream and to the cache file. Once we’ve gone through this exercise, it’s only natural to wonder whether it was worth the trouble. I ran this method, as well as GDI and WPF resizing over thirty twelve megapixel images for JPEG qualities between 70% and 100% and measured the file size and time to resize. Here are the results: Size of resized images Time to resize thirty 12 megapixel images Not much to see on the size graph: sizes from WPF and WIC are equivalent, which is hardly surprising as WPF calls into WIC. There is just an anomaly for 75% for WPF that I noted in my previous article and that disappears when using WIC directly. But overall, using WPF or WIC over GDI represents a slight win in file size. The time to resize is more interesting. WPF and WIC get similar times although WIC seems to always be a little faster. Not surprising considering WPF is using WIC. The margin of error on this results is probably fairly close to the time difference. As we already knew, the time to resize does not depend on the quality level, only the size does. This means that the only decision you have to make here is size versus visual quality. This third approach to server-side image resizing on ASP.NET seems to converge on the fastest possible one. We have marginally better performance than WPF, but with some additional peace of mind that this approach is sanctioned for server-side usage by the Windows Imaging team. It still doesn’t work in medium trust. That is a problem and shows the way for future server-friendly managed wrappers around WIC. The sample code for this article can be downloaded from: http://weblogs.asp.net/blogs/bleroy/Samples/WicResize.zip The benchmark code can be found here (you’ll need to add your own images to the Images directory and then add those to the project, with content and copy if newer in the properties of the files in the solution explorer): http://weblogs.asp.net/blogs/bleroy/Samples/WicWpfGdiImageResizeBenchmark.zip WIC tools can be downloaded from: http://code.msdn.microsoft.com/wictools To conclude, here are some of the resized thumbnails at 85% fant:

Read the article

Optimizing Solaris 11 SHA-1 on Intel Processors

- by danx

SHA-1 is a "hash" or "digest" operation that produces a 160 bit (20 byte) checksum value on arbitrary data, such as a file. It is intended to uniquely identify text and to verify it hasn't been modified. Max Locktyukhin and others at Intel have improved the performance of the SHA-1 digest algorithm using multiple techniques. This code has been incorporated into Solaris 11 and is available in the Solaris Crypto Framework via the libmd(3LIB), the industry-standard libpkcs11(3LIB) library, and Solaris kernel module sha1. The optimized code is used automatically on systems with a x86 CPU supporting SSSE3 (Intel Supplemental SSSE3). Intel microprocessor architectures that support SSSE3 include Nehalem, Westmere, Sandy Bridge microprocessor families. Further optimizations are available for microprocessors that support AVX (such as Sandy Bridge). Although SHA-1 is considered obsolete because of weaknesses found in the SHA-1 algorithm—NIST recommends using at least SHA-256, SHA-1 is still widely used and will be with us for awhile more. Collisions (the same SHA-1 result for two different inputs) can be found with moderate effort. SHA-1 is used heavily though in SSL/TLS, for example. And SHA-1 is stronger than the older MD5 digest algorithm, another digest option defined in SSL/TLS. Optimizations Review SHA-1 operates by reading an arbitrary amount of data. The data is read in 512 bit (64 byte) blocks (the last block is padded in a specific way to ensure it's a full 64 bytes). Each 64 byte block has 80 "rounds" of calculations (consisting of a mixture of "ROTATE-LEFT", "AND", and "XOR") applied to the block. Each round produces a 32-bit intermediate result, called W[i]. Here's what each round operates: The first 16 rounds, rounds 0 to 15, read the 512 bit block 32 bits at-a-time. These 32 bits is used as input to the round. The remaining rounds, rounds 16 to 79, use the results from the previous rounds as input. Specifically for round i it XORs the results of rounds i-3, i-8, i-14, and i-16 and rotates the result left 1 bit. The remaining calculations for the round is a series of AND, XOR, and ROTATE-LEFT operators on the 32-bit input and some constants. The 32-bit result is saved as W[i] for round i. The 32-bit result of the final round, W[79], is the SHA-1 checksum. Optimization: Vectorization The first 16 rounds can be vectorized (computed in parallel) because they don't depend on the output of a previous round. As for the remaining rounds, because of step 2 above, computing round i depends on the results of round i-3, W[i-3], one can vectorize 3 rounds at-a-time. Max Locktyukhin found through simple factoring, explained in detail in his article referenced below, that the dependencies of round i on the results of rounds i-3, i-8, i-14, and i-16 can be replaced instead with dependencies on the results of rounds i-6, i-16, i-28, and i-32. That is, instead of initializing intermediate result W[i] with: W[i] = (W[i-3] XOR W[i-8] XOR W[i-14] XOR W[i-16]) ROTATE-LEFT 1 Initialize W[i] as follows: W[i] = (W[i-6] XOR W[i-16] XOR W[i-28] XOR W[i-32]) ROTATE-LEFT 2 That means that 6 rounds could be vectorized at once, with no additional calculations, instead of just 3! This optimization is independent of Intel or any other microprocessor architecture, although the microprocessor has to support vectorization to use it, and exploits one of the weaknesses of SHA-1. Optimization: SSSE3 Intel SSSE3 makes use of 16 %xmm registers, each 128 bits wide. The 4 32-bit inputs to a round, W[i-6], W[i-16], W[i-28], W[i-32], all fit in one %xmm register. The following code snippet, from Max Locktyukhin's article, converted to ATT assembly syntax, computes 4 rounds in parallel with just a dozen or so SSSE3 instructions: movdqa W_minus_04, W_TMP pxor W_minus_28, W // W equals W[i-32:i-29] before XOR // W = W[i-32:i-29] ^ W[i-28:i-25] palignr $8, W_minus_08, W_TMP // W_TMP = W[i-6:i-3], combined from // W[i-4:i-1] and W[i-8:i-5] vectors pxor W_minus_16, W // W = (W[i-32:i-29] ^ W[i-28:i-25]) ^ W[i-16:i-13] pxor W_TMP, W // W = (W[i-32:i-29] ^ W[i-28:i-25] ^ W[i-16:i-13]) ^ W[i-6:i-3]) movdqa W, W_TMP // 4 dwords in W are rotated left by 2 psrld $30, W // rotate left by 2 W = (W >> 30) | (W << 2) pslld $2, W_TMP por W, W_TMP movdqa W_TMP, W // four new W values W[i:i+3] are now calculated paddd (K_XMM), W_TMP // adding 4 current round's values of K movdqa W_TMP, (WK(i)) // storing for downstream GPR instructions to read A window of the 32 previous results, W[i-1] to W[i-32] is saved in memory on the stack. This is best illustrated with a chart. Without vectorization, computing the rounds is like this (each "R" represents 1 round of SHA-1 computation): RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR With vectorization, 4 rounds can be computed in parallel: RRRRRRRRRRRRRRRRRRRR RRRRRRRRRRRRRRRRRRRR RRRRRRRRRRRRRRRRRRRR RRRRRRRRRRRRRRRRRRRR Optimization: AVX The new "Sandy Bridge" microprocessor architecture, which supports AVX, allows another interesting optimization. SSSE3 instructions have two operands, a input and an output. AVX allows three operands, two inputs and an output. In many cases two SSSE3 instructions can be combined into one AVX instruction. The difference is best illustrated with an example. Consider these two instructions from the snippet above: pxor W_minus_16, W // W = (W[i-32:i-29] ^ W[i-28:i-25]) ^ W[i-16:i-13] pxor W_TMP, W // W = (W[i-32:i-29] ^ W[i-28:i-25] ^ W[i-16:i-13]) ^ W[i-6:i-3]) With AVX they can be combined in one instruction: vpxor W_minus_16, W, W_TMP // W = (W[i-32:i-29] ^ W[i-28:i-25] ^ W[i-16:i-13]) ^ W[i-6:i-3]) This optimization is also in Solaris, although Sandy Bridge-based systems aren't widely available yet. As an exercise for the reader, AVX also has 256-bit media registers, %ymm0 - %ymm15 (a superset of 128-bit %xmm0 - %xmm15). Can %ymm registers be used to parallelize the code even more? Optimization: Solaris-specific In addition to using the Intel code described above, I performed other minor optimizations to the Solaris SHA-1 code: Increased the digest(1) and mac(1) command's buffer size from 4K to 64K, as previously done for decrypt(1) and encrypt(1). This size is well suited for ZFS file systems, but helps for other file systems as well. Optimized encode functions, which byte swap the input and output data, to copy/byte-swap 4 or 8 bytes at-a-time instead of 1 byte-at-a-time. Enhanced the Solaris mdb(1) and kmdb(1) debuggers to display all 16 %xmm and %ymm registers (mdb "$x" command). Previously they only displayed the first 8 that are available in 32-bit mode. Can't optimize if you can't debug :-). Changed the SHA-1 code to allow processing in "chunks" greater than 2 Gigabytes (64-bits) Performance I measured performance on a Sun Ultra 27 (which has a Nehalem-class Xeon 5500 Intel W3570 microprocessor @3.2GHz). Turbo mode is disabled for consistent performance measurement. Graphs are better than words and numbers, so here they are: The first graph shows the Solaris digest(1) command before and after the optimizations discussed here, contained in libmd(3LIB). I ran the digest command on a half GByte file in swapfs (/tmp) and execution time decreased from 1.35 seconds to 0.98 seconds. The second graph shows the the results of an internal microbenchmark that uses the Solaris libpkcs11(3LIB) library. The operations are on a 128 byte buffer with 10,000 iterations. The results show operations increased from 320,000 to 416,000 operations per second. Finally the third graph shows the results of an internal kernel microbenchmark that uses the Solaris /kernel/crypto/amd64/sha1 module. The operations are on a 64Kbyte buffer with 100 iterations. third graph shows the results of an internal kernel microbenchmark that uses the Solaris /kernel/crypto/amd64/sha1 module. The operations are on a 64Kbyte buffer with 100 iterations. The results show for 1 kernel thread, operations increased from 410 to 600 MBytes/second. For 8 kernel threads, operations increase from 1540 to 1940 MBytes/second. Availability This code is in Solaris 11 FCS. It is available in the 64-bit libmd(3LIB) library for 64-bit programs and is in the Solaris kernel. You must be running hardware that supports Intel's SSSE3 instructions (for example, Intel Nehalem, Westmere, or Sandy Bridge microprocessor architectures). The easiest way to determine if SSSE3 is available is with the isainfo(1) command. For example, nehalem $ isainfo -v $ isainfo -v 64-bit amd64 applications sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu 32-bit i386 applications sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov sep cx8 tsc fpu If the output also shows "avx", the Solaris executes the even-more optimized 3-operand AVX instructions for SHA-1 mentioned above: sandybridge $ isainfo -v 64-bit amd64 applications avx xsave pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu 32-bit i386 applications avx xsave pclmulqdq aes sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx cmov sep cx8 tsc fpu No special configuration or setup is needed to take advantage of this code. Solaris libraries and kernel automatically determine if it's running on SSSE3 or AVX-capable machines and execute the correctly-tuned code for that microprocessor. Summary The Solaris 11 Crypto Framework, via the sha1 kernel module and libmd(3LIB) and libpkcs11(3LIB) libraries, incorporated a useful SHA-1 optimization from Intel for SSSE3-capable microprocessors. As with other Solaris optimizations, they come automatically "under the hood" with the current Solaris release. References "Improving the Performance of the Secure Hash Algorithm (SHA-1)" by Max Locktyukhin (Intel, March 2010). The source for these SHA-1 optimizations used in Solaris "SHA-1", Wikipedia Good overview of SHA-1 FIPS 180-1 SHA-1 standard (FIPS, 1995) NIST Comments on Cryptanalytic Attacks on SHA-1 (2005, revised 2006)

Search Results

Search found 4304 results on 173 pages for 'bytes'.

Page 113/173 | < Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120 | Next Page >

- by Mikey

- by binaryorganic

- by Gomibushi

- by wdow88

- by allenwei

- by Saumitra

- by gundam

- by user27388

- by davebug

- by Aseques

- by r.s.mahanti

- by Bruno Reis

- by BVernon

- by psmith

- by leden

- by Vaccano

- by alecco

- by Soham

- by Oscar Alejos

- by Gopinath

- by Vamsi Emani

- by hamsun

- by terje

- by Bertrand Le Roy

- by danx

< Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120 | Next Page >