Search Results

Search found 9062 results on 363 pages for 'big'.

Page 79/363 | < Previous Page | 75 76 77 78 79 80 81 82 83 84 85 86  | Next Page >

  • std::ifstream buffer caching

    - by ledokol
    Hello everybody, In my application I'm trying to merge sorted files (keeping them sorted of course), so I have to iterate through each element in both files to write the minimal to the third one. This works pretty much slow on big files, as far as I don't see any other choice (the iteration has to be done) I'm trying to optimize file loading. I can use some amount of RAM, which I can use for buffering. I mean instead of reading 4 bytes from both files every time I can read once something like 100Mb and work with that buffer after that, until there will be no element in buffer, then I'll refill the buffer again. But I guess ifstream is already doing that, will it give me more performance and is there any reason? If fstream does, maybe I can change size of that buffer? added My current code looks like that (pseudocode) // this is done in loop int i1 = input1.read_integer(); int i2 = input2.read_integer(); if (!input1.eof() && !input2.eof()) { if (i1 < i2) { output.write(i1); input2.seek_back(sizeof(int)); } else input1.seek_back(sizeof(int)); output.write(i2); } } else { if (input1.eof()) output.write(i2); else if (input2.eof()) output.write(i1); } What I don't like here is seek_back - I have to seek back to previous position as there is no way to peek 4 bytes too much reading from file if one of the streams is in EOF it still continues to check that stream instead of putting contents of another stream directly to output, but this is not a big issue, because chunk sizes are almost always equal. Can you suggest improvement for that? Thanks.

    Read the article

  • MySQL running on an EC2 m1.small instance has high load but low memory usage, possible resolutions?

    - by Tosh
    I have a MySQL server 5.0.75 Ubuntu, on an m1.small instance running on Amazon's EC2 as part of an application. During peak usage the server load will rise very high, while the memory usage stays low and the application server is no longer responsive since it's waiting for query results. The application server has only 5-8 apache processes running (mod_perl processes). The data directory uses only 140MB of data so the MyIsam tables aren't very big. The queries are pretty complicated with some big joins being performed, and the application makes a lot of queries. mysqltuner reports everything OK except "Maximum possible memory usage: 1.7G (99% of installed RAM)" but I'm nowhere close to using that. My question is, where should I be looking to fix this? Is this something that can be tuned away, or do I just need a larger instance/server? Googling indicates either or also upgrading MySQL server. Any pointers in the right direction would be greatly appreciated, thanks! EDIT: I just discovered this in my slow queries log: # Time: 101116 11:17:00 # User@Host: user[pass] @ [host] # Query_time: 4063 Lock_time: 1035 Rows_sent: 0 Rows_examined: 19960174 SELECT * FROM contacts WHERE contacts.contact_id IN (SELECT external_id FROM contact_relations WHERE external_table = 'contacts' AND contact_id IN (SELECT contact_id FROM contacts WHERE (company_name like '%%butan%%%' OR country like '%%butan%%%' OR city like '%%butan%%%' OR email1 like '%%butan%%%') AND (company_name is not null and company_name != ''))); Which actually brings up a different but related question: If I have a contact table containing: John Smith,The Fun Factory,555-1212,[email protected] What's the best way to search for that record using "factory" as a search key? Fulltext rarely seems to find items in the middle of a word, for example "actor" should bring up "Factory"

    Read the article

  • Subquery with multiple results combined into a single field?

    - by Todd
    Assume I have these tables, from which i need to display search results in a browser: Table: Containers id | name 1 Big Box 2 Grocery Bag 3 Envelope 4 Zip Lock Table: Sale id | date | containerid 1 20100101 1 2 20100102 2 3 20091201 3 4 20091115 4 Table: Items id | name | saleid 1 Barbie Doll 1 2 Coin 3 3 Pop-Top 4 4 Barbie Doll 2 5 Coin 4 I need output that looks like this: itemid itemname saleids saledates containerids containertypes 1 Barbie Doll 1,2 20100101,20100102 1,2 Big Box, Grocery Bag 2 Coin 3,4 20091201,20091115 3,4 Envelope, Zip Lock 3 Pop-Top 4 20091115 4 Zip Lock The important part is that each item type only gets one record/row in the return on the screen. I accomplished this in the past by returning multiple rows of the same item and using a scripting language to limit the output. However, this makes the ui overly complicated and loopy. So, I'm hoping I can get the database to spit out only as many records as there are rows to display. This example may be a bit extreme because of the 2 joins needed to get to the container from the item (through the sale table). I'd be happy for just an example query that outputs this: itemid itemname saleids saledates 1 Barbie Doll 1,2 20100101,20100102 2 Coin 3,4 20091201,20091115 3 Pop-Top 4 20091115 I can only return a single result in a subquery, so I'm not sure how to do this.

    Read the article

  • Why can't IE6 shows semi transparent png8 files with alpha filter ?

    - by vvo
    -- read the whole question before answering -- Hi, i work on a big website that had a lot (45000+) of png24 images (with semi transparency). I converted them to png8 and it works very well (a big help on page load time...). The thing is i had to keep png24 files for ie6 users (with alpha filter to have semi transparent pixels) because we all know that we can't use png 8 semi transparent images in IE6 : either the semi transparent pixels will be opaque or completely transparent. I tried to use the alpha image loader filter with png8 images but it just don't work, the pixels are still opaque/completely transparent, no semi transparency. What's the reason it's not working ? Is there a difference for IE when dealing with semi transparent pixels from a png24 or from a png8 ? I couldn't find any information on msdn websites or on stackoverflow... This is crazy... ! DISCLAIMER : i'm not searching for a f**ckin fix IE6 png or sh*t like that, i already know alpha image loader or htc techniques etc, theses all works well with PNG24 files but doesn't work with png8 files.

    Read the article

  • Unicode version of base64 encoding/ decoding

    - by Yan Cheng CHEOK
    I am using base64 encoding/decoding from http://www.adp-gmbh.ch/cpp/common/base64.html It works pretty well with the following code. const std::string s = "I Am A Big Fat Cat" ; std::string encoded = base64_encode(reinterpret_cast<const unsigned char*>(s.c_str()), s.length()); std::string decoded = base64_decode(encoded); std::cout << _T("encoded: ") << encoded << std::endl; std::cout << _T("decoded: ") << decoded << std::endl; However, when comes to unicode namespace std { #ifdef _UNICODE typedef wstring tstring; #else typedef string tstring; #endif } const std::tstring s = _T("I Am A Big Fat Cat"); How can I still make use of the above function? Merely changing std::string base64_encode(unsigned TCHAR const* , unsigned int len); std::tstring base64_decode(std::string const& s); will not work correctly. (I expect base64_encode to return ASCII. Hence, std::string should be used instead of std::tstring)

    Read the article

  • How to accommodate for the next iPhones totally different screen resolution?

    - by mystify
    This is a programming question! Read on before you vote to close! According to Gizmodo, the next iPhone will have a new screen resolution: The 3.5-inch screen has a resolution of 960?×?640 pixels This little detail affects our apps in a heavy way. Most of the demo apps on the net have one thing in common: They position views in the believe that the screen has a fixed size of 320 x 480 pixels. So what most -if not all- developers do is: They designed everything in such a way, that a touchable area is -for example- 50 x 50 pixels big. Just enough to tap it. Things have been positioned relative to the upper left, to reach a specific position on screen - let's say the center, or somewhere at the bottom. So the big question is: How will the developers compensate their layout and graphics? Are there already solutions which can be used to calculate coordinates and sizes in a normalized manner, which then appear to be exactly the same when viewing them on a screen of any resolution, assuming at least that the aspect ration won't change? This is community wiki. Just add anything that you think is relevant to this huge problem (constant screen res was one of the main reasons why I didn't go for Android!!).

    Read the article

  • What data stucture should I use for BigInt class

    - by user1086004
    I would like to implement a BigInt class which will be able to handle really big numbers. I want only to add and multiply numbers, however the class should also handle negative numbers. I wanted to represent the number as a string, but there is a big overhead with converting string to int and back for adding. I want to implement addition as on the high school, add corresponding order and if the result is bigger than 10, add the carry to next order. Then I thought that it would be better to handle it as a array of unsigned long long int and keep the sign separated by bool. With this I'm afraid of size of the int, as C++ standard as far as I know guarantees only that int < float < double. Correct me if I'm wrong. So when I reach some number I should move in array forward and start adding number to the next array position. Is there any data structure that is appropriate or better for this? Thanks in advance.

    Read the article

  • Why isn't obliterate an essential feature of Subversion?

    - by Dimitri C.
    For some years now, I'm waiting for Subversion to feature a "delete permanently" (obliterate) function. I hesitate to make the transition to Subversion (coming from Visual SourceSafe :p), because I think this is an essential feature, as otherwise I'd expect the repository to grow unstopably. However, for one reason or the other, the feature gets postponed over and over again. So I begin wondering if there is some other feature or workaround which makes the obliterate function dispensable. What do you do when you want to shrink the SVN central repository? Example 1: I check in a large third party library, and after a few weeks I realize it is not suited for my needs. I don't want that to store and backup that large amount of data forever. Example 2: I have 10 versions of 10 big third party libraries in the repository, but I only use the latest versions. Example 3: I accidentally checked in sensitive information (as suggested by John). Example 4: I accidentally checked in some big files that were never meant to be put in the repository.

    Read the article

  • Read huge free text docs in one file for lucene indexing

    - by Jun
    I have heaps of free text news docs in one big file. The structure of each news doc is like: (Header line) Category, Doc1, Date (day, month, year) (body text) ... ... ... (Header line) Category, Doc2, Date (day, month, year) (body text) ... ... ... If I extract each doc from the big file, it costs too much time and not efficient. Therefore, I decide to read the file line by line and feed information to lucene the same time. I write c# code to index each doc to lucene like: Streamreader sr = new Streamreader(file); string line = ""; while((line = sr.ReadLine()) != null) { How can I tell this line is a doc header line from text line and get the metadata and all the text lines of a doc for lucene to index. Also, the text is read by OCR which can not give correct line-separating. Captions are mixed with content text iterate the process till the end of the file } with thanks

    Read the article

  • Is it a wise decision to go from dev to third line/Tier 3 dev support?

    - by dotnetdev
    Hi, I am an experienced, mid-level developer. However, I recently spotted a job for a company which is small but has a lot of emphasis on training(beyond the basic technical training, but also mentoring, leadership training, etc). The role is 3rd line so still very technical. It's in app support so it's post implementation development rather than pure out-and-out development like I do now (or don't, as the senior devs do all of the interesting work). However, and this is the question - is this sort of career move common? Also, wouldn't a tech support role be a big shock to the system because I've never dealt with customers? I therefore think it's a bad move? Working in dev, I am used to the lack of customer contact and it is all filtered through by my manager. But in tech support, contacting customers/rude customers could be scary. I don't mind fixing other people's mistakes (better than me making mistakes!) and doing post-implementation dev for production systems (will give me a lot of discipline), and I do get bored sitting in the same place looking/talking to the same people in suits (I work in a corporate environment). The company puts A LOT of emphasis on training and prospects, which I don't get in the current (big) company I work at. Any advice on how to handle tech support is appreciated! Thanks

    Read the article

  • Prevent WPF control from expanding beyond viewable area

    - by Dan dot net
    I have an Items Control in my user control with a scroll viewer around it for when it gets too big (Too big being content is larger than the viewable area of the user control). The problem is that the grid that it is all in just keeps expanding so that the scroll viewer never kicks in (unless I specify an exact height for the grid). See code below and thanks in advance. <Grid HorizontalAlignment="Stretch" VerticalAlignment="Stretch" Height="300px"> <Grid.RowDefinitions> <RowDefinition Height="*" /> </Grid.RowDefinitions> <GroupBox FontWeight="Bold" Header="Tables" Padding="2"> <ScrollViewer> <ItemsControl FontWeight="Normal" ItemsSource="{Binding Path=AvailableTables}"> <ItemsControl.ItemTemplate> <DataTemplate> <CheckBox Content="{Binding Path=DisplayName}" IsChecked="{Binding Path=IsSelected}" Margin="2,3.5" /> </DataTemplate> </ItemsControl.ItemTemplate> </ItemsControl> </ScrollViewer> </GroupBox> </Grid> I would like to not specify the height.

    Read the article

  • Senior programming 'guru' who can't program - should I find a different career?

    - by confess
    Background: According to my resume I'm supposed to be pretty good at programming. I've worked on a ton of big projects at big companies over many years. When I go for an interview and someone looks at my resume they immediately assume I really know what I'm talking about. I generally communicate well, present myself well, know the 'jargon' and know a lot about technology at a high level, which makes matters worse because after talking to me for a while an interviewer really believes that what my resume says is probably true. The Problem: The problem arises when someone asks me to code something. I choke. As a programmer I have almost no capacity to come up with creative solutions of my own. I can't think through solutions to a programming problem the way good programmers are usually able to. I read questions on StackOverflow and the answer is obvious to me after I read other people's answers but if I am the first person to look at a question with no hints from anyone else I usually don't know where to start. At work it's the same thing. I'm fine if I'm correcting other people's code. I can identify the source of a bug quicker than anyone I work with. But if you ask me to sit down and code up a new application from scratch I will spend ten times longer than programmers who are much more junior than me. Question: Now that I am looking for work this is raising its ugly head in interview situations and making me feel desperately that I'm in the wrong career. I don't know if this problem is incompetence, laziness or some combination of these. Does anyone have any ideas about what I might be dealing with - are there books or exercises that could help me with this basic problem?

    Read the article

  • Best way to import a pack or "system" of new classes??

    - by Joe Blow
    Here's an Advanced question for Advanced developers. So I've written a largish "subsystem". It is essentially a UIViewController called CleverViewController which is a UIViewController. Now, there are a large number of supporting classes (about ten) that do the hard work: perform math functions, image processing, purely logical functions, build images or what have you with thousands of lines of code. (To do this, I simply started a new XCode project / app "Scratchpad" which does little other than load and launch the CleverViewController. So currently it works as an app, which launches CleverViewController. The ten or so classes I mention that are part of the "subsystem" simply sit there in that project/app.) So now, we will use CleverViewController, the new technology generally, in various apps. (Or perhaps friends would want to use it, etc.) What's the best way to "do" this? Have I screwed everything up, and really it should just be ONE (pretty big) class rather than a dozen classes? (I could understand that then as I would simply add that new (big) class where needed, like adding any other class.) Do I have to make a "framework" like the Apple frameworks? (If so, what the hell are they, how do you do it, etc?!?) In fact, do you just have to lamely include all of the dozen classes and that's that (obviously perhaps putting them in a grouped subfolder). What about all the headers and so on? (Currently I just have the dozen includes in the pch file of the scratchpad project.) Shouldn't it be easy to "maintain" this "subsystem" separately and so on? I'm afraid I know nothing about this: if the answer is obvious, hit me over the head and let me know. Thank you for any info on this !

    Read the article

  • Change object on client side or on server side

    - by Polina Feterman
    I'm not sure what is the best practice. I have some big and complex objects (NOT flat). In that object I have many related objects - for example Invoice is the main class and one of it's properties is invoiceSupervisor - a big class by it's own called User. User can also be not flat and have department property - also an object called Department. For example I want create new Invoice. First way: I can present to client several fields to fill in. Some of them will be combos that I will need to fill with available values. For example available invoiceSupervisors. Then all the chosen values I can send to server and on server I can create new Invoice and assign all chosen values to that new Invoice. Then I will need to assign new supervisor I will pull the chosen User by id that user picked up on server from combobox. I might do some verification on the User such as does the user applicable to be invoice supervisor. Then I will assign the User object to invoiceSupervisor. Then after filling all properties I will save the new invoice. Second way: In the beginning I can call to server to get a new Invoice. Then on client I can fill all chosen values , for example I can call to server to get new User object and then fill it's id from combobox and assign the User as invoiceSupervisor. After filling the Invoice object on client I can send it to server and then the server will save the new invoice. Before saving server can run some validations as well. So what is the best approach - to make the object on client and send it to server or to collect all values from client and to make a new object on server using those values ?

    Read the article

  • What is the n in O(n) when comparing sorting algorithms?

    - by Mumfi
    The question is rather simple, but I just can't find a good enough answer. I've taken a look at the most upvoted question regarding the Big-Oh notation, namely this: Plain English explanation of Big O It says there that: For example, sorting algorithms are typically compared based on comparison operations (comparing two nodes to determine their relative ordering). Now let's consider the simple bubble sort algorithm: for (int i = arr.length - 1; i > 0 ; i--) { for (int j = 0; j<i; j++) { if (arr[j] > arr[j+1]) { switchPlaces(...) } } } I know that worst case is O(n^2) and best case is O(n), but what is n exactly? If we attempt to sort an already sorted algorithm (best case), we would end up doing nothing, so why is it still O(n)? We are looping through 2 for-loops still, so if anything it should be O(n^2). n can't be the number of comparison operations, because we still compare all the elements, right? This confuses me, and I appreciate if someone could help me.

    Read the article

  • How to properly manage a complex DB structure?

    - by errr
    Let's say you have several systems using the same DB - each uses several schemes (sometimes same as the other). This structure of these schemes is somewhat very big and complicated. Now, how could you possibly manage such scheme structure? Obviously using some sort of "configuration" - the simplest would be SQL scripts, but a more reasonable solution would be XMLs which can be easily converted into SQL, or some other readable solution (for example, JPA's XMLs or Annotations). This solution though, causes a problem where you can't really tell if your configuration matches the structure of the DB schemes exactly. You can't say if those two are synchronized. Why wouldn't they? Well, in such big structure there are going to be many changes, and you won't always remember to save/commit your configuration after you've altered the schemes, or maybe you did save/commit it, but eventually didn't altered anything in the schemes and forgot to undo the changes to the configuration. More than that, another problem (not caused by the configuration, but isn't addressed by it either) is versioning. I don't see any good way of managing the DB schemes versions (say our last alteration makes 3 systems crash - not good, how to "rollback"?). And thoughts? thx.

    Read the article

  • For a 1view/scene to 2view/scene app, what application should I choose in Xcode?

    - by Tony Xu
    The question may be simple to some others, but I have been struggling with this for a while. The app I want would be like this: first scene/view with two big buttons (no toolbar item), click each one to get into two new scenes. So totally three scenes. In Xcode, what application should I choose? And in storyboard how/should I drag/draw? Thanks. Update: thanks for the link, the big-number-user. I actually read that tutorial before I asked. A little update on what I got so far: 1, I selected "single view", so there's view controller 1 (VC1) in the storyboard. 2, dragged a navigation controller (NC), and move the initial view arrow pointing to NC 3, control-drag to link NC and VC1, selected "relationship segue root view controller" when some small dialog popup. IS THIS CORRECT? 4, created two additional VC, VC3 and VC4, control-drag link each to NC. selected "push", IS THIS CORRECT? 5, in VC1, I added two buttons, showVC3 and showVC4. NOW I DON'T KNOW how to add IBAction to button showVC3 and showVC4. I tried to control-drag it to ViewController.m file @interface and @end section, but failed. What should I do next?

    Read the article

  • Array of Sentences?

    - by user1869915
    Javascript noob here.... I am trying to build a site that will help my kids read predefined sentences from a select group, then when a button is clicked it will display one of the sentences. Is an array the best option for this? For example, I have this array (below) and on the click of a button I would like one of these sentences to appear on the page. <script type="text/javascript"> Sentence = new Array() Sentence[0]='Can we go to the park.'; Sentence[1]='Where is the orange cat? Said the big black dog.'; Sentence[2]='We can make the bird fly away if we jump on something.' Sentence[3]='We can go down to the store with the dog. It is not too far away.' Sentence[4]='My big yellow cat ate the little black bird.' Sentence[5]='I like to read my book at school.' Sentence[6]='We are going to swim at the park.' </script> Again, is an array the best for this and how could I get the sentence to display? Ideally I would want the button to randomly select one of these sentences but just displaying one of them for now would help. Thanks

    Read the article

  • Extracting specific words that end with .c and .h [on hold]

    - by Alberto Mederos
    I have a very big list of file names that end with a the following: .c .h .cpp and much more. I need to extract file names that end with .c and .h How do I do that? Also, how could I add quotation marks to the beginning and end of the word, followed with a comma? For example, if I have this in the list: mi_var.c How could I extract it from a very big list, and everything else that ends in .c and replace it to have quotation marks and a comma at the end? Like this: "mi_var.c", I'm new to this, any help is greatly appreciated. Here is part of the list: gsd5t_image.c, gsd5t_image_sqif.c, proc_arm.c, proc_cortex.c, proc_k32.c, proc_k32_entry.s, proc_k32_test.c, proc_k32_test_start.s, rom_sub_functions.s, rom_sub_functions_gcc.s, sqif_jump_table.s, sqif_jump_table_gcc.s, tracker_wrapper_functions.s, vector_M0.c, ptimer.c, ptimer_arm.c, ptimer_internal.h, ptimer_internal_arm.h, ptimer_internal_k32.h, ptimer_k32.c, RstMod_if.h, drvRstMod.h, tbus.dxy, tbus_common.c, tbus_common.h, act.c, act.h, act.msgs, act_if.c, act_if.h, sat_signal_processor.c, sat_signal_processor.h, ssp.dxy, ssp.msgs, ssp_acq_handlers.c, ssp_acq_handlers.h, ssp_atx_if.c, ssp_atx_if.h, ssp_bitsync_handlers.c, ssp_bitsync_handlers.h, ssp_cohver_handlers.c, ssp_cohver_handlers.h, ssp_cwscan_handlers.c, ssp_cwscan_handlers.h, ssp_track_handlers.c, ssp_track_handlers.h, ssp_atx_if_test_sort.c, ssp_hack.c, ssp_hack.h, ssp_suite.cpp, ssp_suite.h, ssptloop.c, ssptloop.h, sss.dxy, sss.msgs, sss_atx_if.c, sss_atx_if.h, strong_signal_scan.c, So how to extract certain names?

    Read the article

  • Guide to reduce TFS database growth using the Test Attachment Cleaner

    - by terje
    Recently there has been several reports on TFS databases growing too fast and growing too big.  Notable this has been observed when one has started to use more features of the Testing system.  Also, the TFS 2010 handles test results differently from TFS 2008, and this leads to more data stored in the TFS databases. As a consequence of this there has been released some tools to remove unneeded data in the database, and also some fixes to correct for bugs which has been found and corrected during this process.  Further some preventive practices and maintenance rules should be adopted. A lot of people have blogged about this, among these are: Anu’s very important blog post here describes both the problem and solutions to handle it.  She describes both the Test Attachment Cleaner tool, and also some QFE/CU releases to fix some underlying bugs which prevented the tool from being fully effective. Brian Harry’s blog post here describes the problem too This forum thread describes the problem with some solution hints. Ravi Shanker’s blog post here describes best practices on solving this (TBP) Grant Holidays blogpost here describes strategies to use the Test Attachment Cleaner both to detect space problems and how to rectify them.   The problem can be divided into the following areas: Publishing of test results from builds Publishing of manual test results and their attachments in particular Publishing of deployment binaries for use during a test run Bugs in SQL server preventing total cleanup of data (All the published data above is published into the TFS database as attachments.) The test results will include all data being collected during the run.  Some of this data can grow rather large, like IntelliTrace logs and video recordings.   Also the pushing of binaries which happen for automated test runs, including tests run during a build using code coverage which will include all the files in the deployment folder, contributes a lot to the size of the attached data.   In order to handle this systematically, I have set up a 3-stage process: Find out if you have a database space issue Set up your TFS server to minimize potential database issues If you have the “problem”, clean up the database and otherwise keep it clean   Analyze the data Are your database( s) growing ?  Are unused test results growing out of proportion ? To find out about this you need to query your TFS database for some of the information, and use the Test Attachment Cleaner (TAC) to obtain some  more detailed information. If you don’t have too many databases you can use the SQL Server reports from within the Management Studio to analyze the database and table sizes. Or, you can use a set of queries . I find queries often faster to use because I can tweak them the way I want them.  But be aware that these queries are non-documented and non-supported and may change when the product team wants to change them. If you have multiple Project Collections, find out which might have problems: (Disclaimer: The queries below work on TFS 2010. They will not work on Dev-11, since the table structure have been changed.  I will try to update them for Dev-11 when it is released.) Open a SQL Management Studio session onto the SQL Server where you have your TFS Databases. Use the query below to find the Project Collection databases and their sizes, in descending size order.  use master select DB_NAME(database_id) AS DBName, (size/128) SizeInMB FROM sys.master_files where type=0 and substring(db_name(database_id),1,4)='Tfs_' and DB_NAME(database_id)<>'Tfs_Configuration' order by size desc Doing this on one of our SQL servers gives the following results: It is pretty easy to see on which collection to start the work   Find out which tables are possibly too large Keep a special watch out for the Tfs_Attachment table. Use the script at the bottom of Grant’s blog to find the table sizes in descending size order. In our case we got this result: From Grant’s blog we learnt that the tbl_Content is in the Version Control category, so the major only big issue we have here is the tbl_AttachmentContent.   Find out which team projects have possibly too large attachments In order to use the TAC to find and eventually delete attachment data we need to find out which team projects have these attachments. The team project is a required parameter to the TAC. Use the following query to find this, replace the collection database name with whatever applies in your case:   use Tfs_DefaultCollection select p.projectname, sum(a.compressedlength)/1024/1024 as sizeInMB from dbo.tbl_Attachment as a inner join tbl_testrun as tr on a.testrunid=tr.testrunid inner join tbl_project as p on p.projectid=tr.projectid group by p.projectname order by sum(a.compressedlength) desc In our case we got this result (had to remove some names), out of more than 100 team projects accumulated over quite some years: As can be seen here it is pretty obvious the “Byggtjeneste – Projects” are the main team project to take care of, with the ones on lines 2-4 as the next ones.  Check which attachment types takes up the most space It can be nice to know which attachment types takes up the space, so run the following query: use Tfs_DefaultCollection select a.attachmenttype, sum(a.compressedlength)/1024/1024 as sizeInMB from dbo.tbl_Attachment as a inner join tbl_testrun as tr on a.testrunid=tr.testrunid inner join tbl_project as p on p.projectid=tr.projectid group by a.attachmenttype order by sum(a.compressedlength) desc We then got this result: From this it is pretty obvious that the problem here is the binary files, as also mentioned in Anu’s blog. Check which file types, by their extension, takes up the most space Run the following query use Tfs_DefaultCollection select SUBSTRING(filename,len(filename)-CHARINDEX('.',REVERSE(filename))+2,999)as Extension, sum(compressedlength)/1024 as SizeInKB from tbl_Attachment group by SUBSTRING(filename,len(filename)-CHARINDEX('.',REVERSE(filename))+2,999) order by sum(compressedlength) desc This gives a result like this:   Now you should have collected enough information to tell you what to do – if you got to do something, and some of the information you need in order to set up your TAC settings file, both for a cleanup and for scheduled maintenance later.    Get your TFS server and environment properly set up Even if you have got the problem or if have yet not got the problem, you should ensure the TFS server is set up so that the risk of getting into this problem is minimized.  To ensure this you should install the following set of updates and components. The assumption is that your TFS Server is at SP1 level. Install the QFE for KB2608743 – which also contains detailed instructions on its use, download from here. The QFE changes the default settings to not upload deployed binaries, which are used in automated test runs. Binaries will still be uploaded if: Code coverage is enabled in the test settings. You change the UploadDeploymentItem to true in the testsettings file. Be aware that this might be reset back to false by another user which haven't installed this QFE. The hotfix should be installed to The build servers (the build agents) The machine hosting the Test Controller Local development computers (Visual Studio) Local test computers (MTM) It is not required to install it to the TFS Server, test agents or the build controller – it has no effect on these programs. If you use the SQL Server 2008 R2 you should also install the CU 10 (or later).  This CU fixes a potential problem of hanging “ghost” files.  This seems to happen only in certain trigger situations, but to ensure it doesn’t bite you, it is better to make sure this CU is installed. There is no such CU for SQL Server 2008 pre-R2 Work around:  If you suspect hanging ghost files, they can be – with some mental effort, deduced from the ghost counters using the following SQL query: use master SELECT DB_NAME(database_id) as 'database',OBJECT_NAME(object_id) as 'objectname', index_type_desc,ghost_record_count,version_ghost_record_count,record_count,avg_record_size_in_bytes FROM sys.dm_db_index_physical_stats (DB_ID(N'<DatabaseName>'), OBJECT_ID(N'<TableName>'), NULL, NULL , 'DETAILED') The problem is a stalled ghost cleanup process.  Restarting the SQL server after having stopped all components that depends on it, like the TFS Server and SPS services – that is all applications that connect to the SQL server. Then restart the SQL server, and finally start up all dependent processes again.  (I would guess a complete server reboot would do the trick too.) After this the ghost cleanup process will run properly again. The fix will come in the next CU cycle for SQL Server R2 SP1.  The R2 pre-SP1 and R2 SP1 have separate maintenance cycles, and are maintained individually. Each have its own set of CU’s. When it comes I will add the link here to that CU. The "hanging ghost file” issue came up after one have run the TAC, and deleted enourmes amount of data.  The SQL Server can get into this hanging state (without the QFE) in certain cases due to this. And of course, install and set up the Test Attachment Cleaner command line power tool.  This should be done following some guidelines from Ravi Shanker: “When you run TAC, ensure that you are deleting small chunks of data at regular intervals (say run TAC every night at 3AM to delete data that is between age 730 to 731 days) – this will ensure that small amounts of data are being deleted and SQL ghosted record cleanup can catch up with the number of deletes performed. “ This rule minimizes the risk of the ghosted hang problem to occur, and further makes it easier for the SQL server ghosting process to work smoothly. “Run DBCC SHRINKDB post the ghosted records are cleaned up to physically reclaim the space on the file system” This is the last step in a 3 step process of removing SQL server data. First they are logically deleted. Then they are cleaned out by the ghosting process, and finally removed using the shrinkdb command. Cleaning out the attachments The TAC is run from the command line using a set of parameters and controlled by a settingsfile.  The parameters point out a server uri including the team project collection and also point at a specific team project. So in order to run this for multiple team projects regularly one has to set up a script to run the TAC multiple times, once for each team project.  When you install the TAC there is a very useful readme file in the same directory. When the deployment binaries are published to the TFS server, ALL items are published up from the deployment folder. That often means much more files than you would assume are necessary. This is a brute force technique. It works, but you need to take care when cleaning up. Grant has shown how their settings file looks in his blog post, removing all attachments older than 180 days , as long as there are no active workitems connected to them. This setting can be useful to clean out all items, both in a clean-up once operation, and in a general There are two scenarios we need to consider: Cleaning up an existing overgrown database Maintaining a server to avoid an overgrown database using scheduled TAC   1. Cleaning up a database which has grown too big due to these attachments. This job is a “Once” job.  We do this once and then move on to make sure it won’t happen again, by taking the actions in 2) below.  In this scenario you should only consider the large files. Your goal should be to simply reduce the size, and don’t bother about  the smaller stuff. That can be left a scheduled TAC cleanup ( 2 below). Here you can use a very general settings file, and just remove the large attachments, or you can choose to remove any old items.  Grant’s settings file is an example of the last one.  A settings file to remove only large attachments could look like this: <!-- Scenario : Remove large files --> <DeletionCriteria> <TestRun /> <Attachment> <SizeInMB GreaterThan="10" /> </Attachment> </DeletionCriteria> Or like this: If you want only to remove dll’s and pdb’s about that size, add an Extensions-section.  Without that section, all extensions will be deleted. <!-- Scenario : Remove large files of type dll's and pdb's --> <DeletionCriteria> <TestRun /> <Attachment> <SizeInMB GreaterThan="10" /> <Extensions> <Include value="dll" /> <Include value="pdb" /> </Extensions> </Attachment> </DeletionCriteria> Before you start up your scheduled maintenance, you should clear out all older items. 2. Scheduled maintenance using the TAC If you run a schedule every night, and remove old items, and also remove them in small batches.  It is important to run this often, like every night, in order to keep the number of deleted items low. That way the SQL ghost process works better. One approach could be to delete all items older than some number of days, let’s say 180 days. This could be combined with restricting it to keep attachments with active or resolved bugs.  Doing this every night ensures that only small amounts of data is deleted. <!-- Scenario : Remove old items except if they have active or resolved bugs --> <DeletionCriteria> <TestRun> <AgeInDays OlderThan="180" /> </TestRun> <Attachment /> <LinkedBugs> <Exclude state="Active" /> <Exclude state="Resolved"/> </LinkedBugs> </DeletionCriteria> In my experience there are projects which are left with active or resolved workitems, akthough no further work is done.  It can be wise to have a cleanup process with no restrictions on linked bugs at all. Note that you then have to remove the whole LinkedBugs section. A approach which could work better here is to do a two step approach, use the schedule above to with no LinkedBugs as a sweeper cleaning task taking away all data older than you could care about.  Then have another scheduled TAC task to take out more specifically attachments that you are not likely to use. This task could be much more specific, and based on your analysis clean out what you know is troublesome data. <!-- Scenario : Remove specific files early --> <DeletionCriteria> <TestRun > <AgeInDays OlderThan="30" /> </TestRun> <Attachment> <SizeInMB GreaterThan="10" /> <Extensions> <Include value="iTrace"/> <Include value="dll"/> <Include value="pdb"/> <Include value="wmv"/> </Extensions> </Attachment> <LinkedBugs> <Exclude state="Active" /> <Exclude state="Resolved" /> </LinkedBugs> </DeletionCriteria> The readme document for the TAC says that it recognizes “internal” extensions, but it does recognize any extension. To run the tool do the following command: tcmpt attachmentcleanup /collection:your_tfs_collection_url /teamproject:your_team_project /settingsfile:path_to_settingsfile /outputfile:%temp%/teamproject.tcmpt.log /mode:delete   Shrinking the database You could run a shrink database command after the TAC has run in cases where there are a lot of data being deleted.  In this case you SHOULD do it, to free up all that space.  But, after the shrink operation you should do a rebuild indexes, since the shrink operation will leave the database in a very fragmented state, which will reduce performance. Note that you need to rebuild indexes, reorganizing is not enough. For smaller amounts of data you should NOT shrink the database, since the data will be reused by the SQL server when it need to add more records.  In fact, it is regarded as a bad practice to shrink the database regularly.  So on a daily maintenance schedule you should NOT shrink the database. To shrink the database you do a DBCC SHRINKDATABASE command, and then follow up with a DBCC INDEXDEFRAG afterwards.  I find the easiest way to do this is to create a SQL Maintenance plan including the Shrink Database Task and the Rebuild Index Task and just execute it when you need to do this.

    Read the article

  • How Expedia Made My New Bride Cry

    - by Lance Robinson
    Tweet this? Email Expedia and ask them to give me and my new wife our honeymoon? When Expedia followed up their failure with our honeymoon trip with a complete and total lack of acknowledgement of any responsibility for the problem and endless loops of explaining the issue over and over again - I swore that they would make it right. When they brought my new bride to tears, I got an immediate and endless supply of motivation. I hope you will help me make them make it right by posting our story on Twitter, Facebook, your blog, on Expedia itself, and when talking to your friends in person about their own travel plans.   If you are considering using them now for an important trip - reconsider. Short summary: We arrived early for a flight - but Expedia had made a mistake with the data they supplied to JetBlue and Emirates, which resulted in us not being able to check in (one leg of our trip was missing)!  At the time of this post, three people (myself, my wife, and an exceptionally patient JetBlue employee named Mary) each spent hours on the phone with Expedia.  I myself spent right at 3 hours (according to iPhone records), Lauren spent an hour and a half or so, and poor Mary was probably on the phone for a good 3.5 hours.  This is after 5 hours total at the airport.  If you add up our phone time, that is nearly 8 hours of phone time over a 5 hour period with little or no help, stall tactics (?), run-around, denial, shifting of blame, and holding. Details below (times are approximate): First, my wife and I were married yesterday - June 18th, the 3 year anniversary of our first date. She is awesome. She is the nicest person I have ever known, a ton of fun, absolutely beautiful in every way. Ok enough mushy - here are the dirty details. 2:30 AM - Early Check-in Attempt - we attempted to check-in for our flight online. Some sort of technology error on website, instructed to checkin at desk. 4:30 AM - Arrive at airport. Try to check-in at kiosk, get the same error. We got to the JetBlue desk at RDU International Airport, where Mary helped us. Mary discovered that the Expedia provided itinerary does not match the Expedia provided tickets. We are informed that when that happens American, JetBlue, and others that use the same software cannot check you in for the flight because. Why? Because the itinerary was missing a leg of our flight! Basically we were not shown in the system as definitely being able to make it home. Mary called Expedia and was put on hold by their automated system. 4:55 AM - Mary, myself, and my brand new bride all waited for about 25 minutes when finally I decided I would make a call myself on my iPhone while Mary was on the airport phone. In their automated system, I chose "make a new reservation", thinking they might answer a little more quickly than "customer service". Not surprisingly I was connected to an Expedia person within 1 minute. They informed me that they would have to forward me to a customer service specialist. I explained to them that we were already on hold for that and had been for nearly half an hour, that we were going on our honeymoon and that our flight would be leaving soon - could they please help us. "Yes, I will help you". I hand the phone to JetBlue Mary who explains the situation 3 or 4 times. Obviously I couldn't hear both ends of the conversation at this point, but the Expedia person explained what the problem was by stating exactly what Mary had just spent 15 minutes explaining. Mary calmly confirms that this is the problem, and asks Expedia to re-issue the itinerary. Expedia tells Mary that they'll have to transfer her to customer service. Mary asks for someone specific so that we get an answer this time, and goes on hold. Mary get's connected, explains the situation, and then Mary's connection gets terminated. 5:10 AM - Mary calls back to the Expedia automated system again, and we wait for about 5 minutes on hold this time before I pick up my iPhone and call Expedia again myself. Again I go to sales, a person picks up the phone in less than a minute. I explain the situation and let them know that we are now very close to missing our flight for our honeymoon, could they please help us. "Yes, I will help you". Again I give the phone to Mary who provides them with a call back number in case we get disconnected again and explains the situation again. More back and forth with Expedia doing nothing but repeating the same questions, Mary answering the questions with the same information she provided in the original explanation, and Expedia simply restating the problem. Mary again asks them to re-issue the itinerary, and explains that doing so will fix the problem. Expedia again repeats the problem instead of fixing it, and Mary's connection gets terminated. 5:20 AM - Mary again calls back to Expedia. My beautiful bride also calls on her own phone. At this point she is struggling to hold back her tears, stumbling through an explanation of all that has happened and that we are about to miss our flight. Please help us. "Yes, I will help". My beautiful bride's connection gets terminated. Ok, maybe this disconnection isn't an accident. We've now been disconnected 3 times on two different phones. 5:45 AM - I walk away and pleadingly beg a person to help me. They "escalate" the issue to "Rosy" (sp?) at Expedia. I go through the whole song and dance again with Rosy, who gives me the same treatment Mary was given. Rosy blames JetBlue for now having the correct data. Meanwhile Mary is on the phone with Emirates Air (the airline for the second leg of our trip), who agrees with JetBlue that Expedia's data isn't up to date. We are informed by two airport employees that issues like this with Expedia are not uncommon, and that the fix is simple. On the phone iwth Rosy, I ask her to re-issue the itinerary because we are about to miss our flight. She again explains the problem to me. At this point, I am standing at the window, pleading with Rosy to help us get to our honeymoon, watching our airplane. Then our airplane leaves without us. 6:03 AM - At this point we have missed our flight. Re-issuing the itinerary is no longer a solution. I ask Rosy to start from the beginning and work us up a new trip. She says that she cannot do that. She says that she needs to talk to JetBlue and Emirates and find out why we cannot check-in for our flight. I remind Rosy that our flight has already left - I just watched it taxi away - it no longer matters why (not to mention the fact that we already knew why, and have known why since 4:30 AM), and have known the solution since 4:30 AM. Rosy, can you please book a new trip? Yes, but it will cost $400. Excuse me? Now you can, but it will cost ME to fix your mistake? Rosy says that she can escalate the situation to her supervisor but that will take 1.5 hours. 6:15 AM - I told Rosy that if they had re-issued the itinerary as JetBlue asked (at 4:30 AM), my new wife and I might be on the airplane now instead of dealing with this on the phone and missing the beginning (and how much more?) of our honeymoon. Rosy said that it was not necessary to re-issue the itinerary. Out of curiosity, i asked Rosy if there was some financial burden on them to re-issue the itinerary. "No", said Rosy. I asked her if it was a large time burden on Expedia to re-issue the itinerary. "No", said Rosy. I directly asked Rosy: Why wouldn't Expedia have re-issued the itinerary when JetBlue asked? No answer. I asked Rosy: If you had re-issued the itinerary at 4:30, isn't it possible that I would be on that flight right now? She actually surprised me by answering "Yes" to that question. So I pointed out that it followed that Expedia was responsible for the fact that we missed out flight, and she immediately went into more about how the problem was with JetBlue - but now it was ALSO an Emirates Air problem as well. I tell Rosy to go ahead and escalate the issue again, and please call me back in that 1.5 hours (which how is about 1 hour and 10 minutes away). 6:30 AM - I start tweeting my frustration with iPhone. It's now pretty much impossible for us to make it to The Maldives by 3pm, which is the time at which we would need to arrive in order to be allowed service to the actual island where we are staying. Expedia has now given me the run-around for 2 hours, caused me to miss my flight, and worst of all caused my amazing new wife Lauren to miss our honeymoon. You think I was mad? No. Furious. Its ok to make mistakes - but to refuse to fix them and to ruin our honeymoon? No, not ok, Expedia. I swore right then that Expedia would make this right. 7:45 AM - JetBlue mary is still talking her tail off to other people in JetBlue and Emirates Air. Mary works it out so that if Expedia simply books a new trip, JetBlue and Emirates will both waive all the fees. Now we just have to convince Expedia to fix their mistake and get us on our way! Around this time Expedia Rosy calls me back! I inform her of the excellent work of JetBlue Mary - that JetBlue and Emirates both will waive the fees so Expedia can fix their mistake and get us going on our way. She says that she sees documentation of this in her system and that she needs to put me on hold "for 1 to 10 minutes" to talk to Emirates Air (why I'm not exactly sure). I say ok. 8:45 AM - After an hour on hold, Rosy comes on the line and asks me to hold more. I ask her to call me back. 9:35 AM - I put down the iPhone Twitter app and picks up the laptop. You think I made some noise with my iPhone? Heh 11:25 AM - Expedia follows me and sends a canned "We're sorry, DM us the details".  If you look at their Twitter feed, 16 out of the most recent 20 tweets are exactly the same canned response.  The other 4?  Ads.  Um - #MultiFAIL? To Expedia:  You now have had (as explained above) 8 hours of 3 different people explaining our situation, you know the email address of our Expedia account, you know my web blog, you know my Twitter address, you know my phone number.  You also know how upset you have made both me and my new bride by treating us with such a ... non caring, scripted, uncooperative, argumentative, and possibly even deceitful manner.  In the wise words of the great Kenan Thompson of SNL: "FIX IT!".  And no, I'm NOT going away until you make this right. Period. 11:45 AM - Expedia corporate office called.  The woman I spoke to was very nice and apologetic.  She listened to me tell the story again, she says she understands the problem and she is going to work to resolve it.  I don't have any details on what exactly that resolution might me, she said she will call me back in 20 minutes.  She found out about the problem via Twitter.  Thank you Twitter, and all of you who helped.  Hopefully social media will win my wife and I our honeymoon, and hopefully Expedia will encourage their customer service teams treat their customers properly. 12:22 PM - Spoke to Fran again from Expedia corporate office.  She has a flight for us tonight.  She is booking it now.  We will arrive at our honeymoon destination of beautiful Veligandu Island Resort only 1 day late.  She cannot confirm today, but she expects that Expedia will pay for the lost honeymoon night.  Thank you everyone for your help.  I will reflect more on this whole situation and confirm its resolution after our flight is 100% confirmed.  For now, I'm going to take a breather and go kiss my wonderful wife! 1:50 PM - Have not yet received the promised phone call.  We did receive an email with a new itinerary for a flight but the booking is not for specific seats, so there is no guarantee that my wife and I will be able to sit together.  With the original booking I carefully selected our seats for every segment of our trip.  I decided to call into the phone number that Fran from the Expedia corporate office gave me.  Its automated voice system identified itself as "Tier 3 Support".  I am currently still on hold with them, I have not gotten through to a human yet. 1:55 PM - Fran from Expedia called me back.  She confirmed us as booked.  She called the airlines to confirm.  Unfortunately, Expedia was unwilling or unable to allow us any type of seat selection.  It is possible that i won't get to sit next to the woman I married less than a day ago on our 40 total hours of flight time (there and back).  In addition, our seats could be the worst seats on the planes, with no reclining seat back or right next to the restroom.  Despite this fact (which in my opinion is huge), the horrible inconvenience, the hours at the airport, and the negative Internet publicity that Expedia is receiving, Expedia declined to offer us any kind of upgrade or to mark us as SFU (suitable for upgrade).  Since they didn't offer - I asked, and was rejected.  I am grateful to finally be heading in the right direction, but not only did Expedia horribly botch this job from the very beginning, they followed that botch job with near zero customer service, followed by a verbally apologetic but otherwise half-hearted resolution.  If this works out favorably for us, great.  If not - I'm not done making noise, Expedia.  You owe us, and I expect you to make it right.  You haven't quite done that yet. Thanks - Thank you to Twitter.  Thanks to all those who sympathize with us and helped us get the attention of Expedia, since three people (one of them an airline employee) using Expedia's normal channels of communication for many hours didn't help.  Thanks especially to my PowerShell and Sharepoint friends, my local friends, and those connectors who encouraged me and spread my story. 5:15 PM - Love Wins - After all this, Lauren and I are exhausted.  We both took a short nap, and when we woke up we talked about the last 24 hours.  It was a big, amazing, story-filled 24 hours.  I said that Expedia won, but Lauren said no.  She pointed out how lucky we are.  We are in love and married.  We have wonderful family and friends.  We are both hard-working successful people who love what they do.  We get to go to an amazing exotic destination for our honeymoon like Veligandu in The Maldives...  That's a lot of good.  Expedia didn't win.  This was (is) a big loss for Expedia.  It is a public blemish for all to see.  But Lauren and I did win, big time.  Expedia may not have made things right - but things are right for us.  Post in progress... I will relay any further comments (or lack of) from Expedia soon, as well as an update on confirmation of their repayment of our lost resort room rates.  I'll also post a picture of us on our honeymoon as soon as I can!

    Read the article

  • ASP.NET and HTML5 Local Storage

    - by Stephen Walther
    My favorite feature of HTML5, hands-down, is HTML5 local storage (aka DOM storage). By taking advantage of HTML5 local storage, you can dramatically improve the performance of your data-driven ASP.NET applications by caching data in the browser persistently. Think of HTML5 local storage like browser cookies, but much better. Like cookies, local storage is persistent. When you add something to browser local storage, it remains there when the user returns to the website (possibly days or months later). Importantly, unlike the cookie storage limitation of 4KB, you can store up to 10 megabytes in HTML5 local storage. Because HTML5 local storage works with the latest versions of all modern browsers (IE, Firefox, Chrome, Safari), you can start taking advantage of this HTML5 feature in your applications right now. Why use HTML5 Local Storage? I use HTML5 Local Storage in the JavaScript Reference application: http://Superexpert.com/JavaScriptReference The JavaScript Reference application is an HTML5 app that provides an interactive reference for all of the syntax elements of JavaScript (You can read more about the application and download the source code for the application here). When you open the application for the first time, all of the entries are transferred from the server to the browser (all 300+ entries). All of the entries are stored in local storage. When you open the application in the future, only changes are transferred from the server to the browser. The benefit of this approach is that the application performs extremely fast. When you click the details link to view details on a particular entry, the entry details appear instantly because all of the entries are stored on the client machine. When you perform key-up searches, by typing in the filter textbox, matching entries are displayed very quickly because the entries are being filtered on the local machine. This approach can have a dramatic effect on the performance of any interactive data-driven web application. Interacting with data on the client is almost always faster than interacting with the same data on the server. Retrieving Data from the Server In the JavaScript Reference application, I use Microsoft WCF Data Services to expose data to the browser. WCF Data Services generates a REST interface for your data automatically. Here are the steps: Create your database tables in Microsoft SQL Server. For example, I created a database named ReferenceDB and a database table named Entities. Use the Entity Framework to generate your data model. For example, I used the Entity Framework to generate a class named ReferenceDBEntities and a class named Entities. Expose your data through WCF Data Services. I added a WCF Data Service to my project and modified the data service class to look like this:   using System.Data.Services; using System.Data.Services.Common; using System.Web; using JavaScriptReference.Models; namespace JavaScriptReference.Services { [System.ServiceModel.ServiceBehavior(IncludeExceptionDetailInFaults = true)] public class EntryService : DataService<ReferenceDBEntities> { // This method is called only once to initialize service-wide policies. public static void InitializeService(DataServiceConfiguration config) { config.UseVerboseErrors = true; config.SetEntitySetAccessRule("*", EntitySetRights.All); config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2; } // Define a change interceptor for the Products entity set. [ChangeInterceptor("Entries")] public void OnChangeEntries(Entry entry, UpdateOperations operations) { if (!HttpContext.Current.Request.IsAuthenticated) { throw new DataServiceException("Cannot update reference unless authenticated."); } } } }     The WCF data service is named EntryService. Notice that it derives from DataService<ReferenceEntitites>. Because it derives from DataService<ReferenceEntities>, the data service exposes the contents of the ReferenceEntitiesDB database. In the code above, I defined a ChangeInterceptor to prevent un-authenticated users from making changes to the database. Anyone can retrieve data through the service, but only authenticated users are allowed to make changes. After you expose data through a WCF Data Service, you can use jQuery to retrieve the data by performing an Ajax call. For example, I am using an Ajax call that looks something like this to retrieve the JavaScript entries from the EntryService.svc data service: $.ajax({ dataType: "json", url: “/Services/EntryService.svc/Entries”, success: function (result) { var data = callback(result["d"]); } });     Notice that you must unwrap the data using result[“d”]. After you unwrap the data, you have a JavaScript array of the entries. I’m transferring all 300+ entries from the server to the client when the application is opened for the first time. In other words, I transfer the entire database from the server to the client, once and only once, when the application is opened for the first time. The data is transferred using JSON. Here is a fragment: { "d" : [ { "__metadata": { "uri": "http://superexpert.com/javascriptreference/Services/EntryService.svc/Entries(1)", "type": "ReferenceDBModel.Entry" }, "Id": 1, "Name": "Global", "Browsers": "ff3_6,ie8,ie9,c8,sf5,es3,es5", "Syntax": "object", "ShortDescription": "Contains global variables and functions", "FullDescription": "<p>\nThe Global object is determined by the host environment. In web browsers, the Global object is the same as the windows object.\n</p>\n<p>\nYou can use the keyword <code>this</code> to refer to the Global object when in the global context (outside of any function).\n</p>\n<p>\nThe Global object holds all global variables and functions. For example, the following code demonstrates that the global <code>movieTitle</code> variable refers to the same thing as <code>window.movieTitle</code> and <code>this.movieTitle</code>.\n</p>\n<pre>\nvar movieTitle = \"Star Wars\";\nconsole.log(movieTitle === this.movieTitle); // true\nconsole.log(movieTitle === window.movieTitle); // true\n</pre>\n", "LastUpdated": "634298578273756641", "IsDeleted": false, "OwnerId": null }, { "__metadata": { "uri": "http://superexpert.com/javascriptreference/Services/EntryService.svc/Entries(2)", "type": "ReferenceDBModel.Entry" }, "Id": 2, "Name": "eval(string)", "Browsers": "ff3_6,ie8,ie9,c8,sf5,es3,es5", "Syntax": "function", "ShortDescription": "Evaluates and executes JavaScript code dynamically", "FullDescription": "<p>\nThe following code evaluates and executes the string \"3+5\" at runtime.\n</p>\n<pre>\nvar result = eval(\"3+5\");\nconsole.log(result); // returns 8\n</pre>\n<p>\nYou can rewrite the code above like this:\n</p>\n<pre>\nvar result;\neval(\"result = 3+5\");\nconsole.log(result);\n</pre>", "LastUpdated": "634298580913817644", "IsDeleted": false, "OwnerId": 1 } … ]} I worried about the amount of time that it would take to transfer the records. According to Google Chome, it takes about 5 seconds to retrieve all 300+ records on a broadband connection over the Internet. 5 seconds is a small price to pay to avoid performing any server fetches of the data in the future. And here are the estimated times using different types of connections using Fiddler: Notice that using a modem, it takes 33 seconds to download the database. 33 seconds is a significant chunk of time. So, I would not use the approach of transferring the entire database up front if you expect a significant portion of your website audience to connect to your website with a modem. Adding Data to HTML5 Local Storage After the JavaScript entries are retrieved from the server, the entries are stored in HTML5 local storage. Here’s the reference documentation for HTML5 storage for Internet Explorer: http://msdn.microsoft.com/en-us/library/cc197062(VS.85).aspx You access local storage by accessing the windows.localStorage object in JavaScript. This object contains key/value pairs. For example, you can use the following JavaScript code to add a new item to local storage: <script type="text/javascript"> window.localStorage.setItem("message", "Hello World!"); </script>   You can use the Google Chrome Storage tab in the Developer Tools (hit CTRL-SHIFT I in Chrome) to view items added to local storage: After you add an item to local storage, you can read it at any time in the future by using the window.localStorage.getItem() method: <script type="text/javascript"> window.localStorage.setItem("message", "Hello World!"); </script>   You only can add strings to local storage and not JavaScript objects such as arrays. Therefore, before adding a JavaScript object to local storage, you need to convert it into a JSON string. In the JavaScript Reference application, I use a wrapper around local storage that looks something like this: function Storage() { this.get = function (name) { return JSON.parse(window.localStorage.getItem(name)); }; this.set = function (name, value) { window.localStorage.setItem(name, JSON.stringify(value)); }; this.clear = function () { window.localStorage.clear(); }; }   If you use the wrapper above, then you can add arbitrary JavaScript objects to local storage like this: var store = new Storage(); // Add array to storage var products = [ {name:"Fish", price:2.33}, {name:"Bacon", price:1.33} ]; store.set("products", products); // Retrieve items from storage var products = store.get("products");   Modern browsers support the JSON object natively. If you need the script above to work with older browsers then you should download the JSON2.js library from: https://github.com/douglascrockford/JSON-js The JSON2 library will use the native JSON object if a browser already supports JSON. Merging Server Changes with Browser Local Storage When you first open the JavaScript Reference application, the entire database of JavaScript entries is transferred from the server to the browser. Two items are added to local storage: entries and entriesLastUpdated. The first item contains the entire entries database (a big JSON string of entries). The second item, a timestamp, represents the version of the entries. Whenever you open the JavaScript Reference in the future, the entriesLastUpdated timestamp is passed to the server. Only records that have been deleted, updated, or added since entriesLastUpdated are transferred to the browser. The OData query to get the latest updates looks like this: http://superexpert.com/javascriptreference/Services/EntryService.svc/Entries?$filter=(LastUpdated%20gt%20634301199890494792L) If you remove URL encoding, the query looks like this: http://superexpert.com/javascriptreference/Services/EntryService.svc/Entries?$filter=(LastUpdated gt 634301199890494792L) This query returns only those entries where the value of LastUpdated > 634301199890494792 (the version timestamp). The changes – new JavaScript entries, deleted entries, and updated entries – are merged with the existing entries in local storage. The JavaScript code for performing the merge is contained in the EntriesHelper.js file. The merge() method looks like this:   merge: function (oldEntries, newEntries) { // concat (this performs the add) oldEntries = oldEntries || []; var mergedEntries = oldEntries.concat(newEntries); // sort this.sortByIdThenLastUpdated(mergedEntries); // prune duplicates (this performs the update) mergedEntries = this.pruneDuplicates(mergedEntries); // delete mergedEntries = this.removeIsDeleted(mergedEntries); // Sort this.sortByName(mergedEntries); return mergedEntries; },   The contents of local storage are then updated with the merged entries. I spent several hours writing the merge() method (much longer than I expected). I found two resources to be extremely useful. First, I wrote extensive unit tests for the merge() method. I wrote the unit tests using server-side JavaScript. I describe this approach to writing unit tests in this blog entry. The unit tests are included in the JavaScript Reference source code. Second, I found the following blog entry to be super useful (thanks Nick!): http://nicksnettravels.builttoroam.com/post/2010/08/03/OData-Synchronization-with-WCF-Data-Services.aspx One big challenge that I encountered involved timestamps. I originally tried to store an actual UTC time as the value of the entriesLastUpdated item. I quickly discovered that trying to work with dates in JSON turned out to be a big can of worms that I did not want to open. Next, I tried to use a SQL timestamp column. However, I learned that OData cannot handle the timestamp data type when doing a filter query. Therefore, I ended up using a bigint column in SQL and manually creating the value when a record is updated. I overrode the SaveChanges() method to look something like this: public override int SaveChanges(SaveOptions options) { var changes = this.ObjectStateManager.GetObjectStateEntries( EntityState.Modified | EntityState.Added | EntityState.Deleted); foreach (var change in changes) { var entity = change.Entity as IEntityTracking; if (entity != null) { entity.LastUpdated = DateTime.Now.Ticks; } } return base.SaveChanges(options); }   Notice that I assign Date.Now.Ticks to the entity.LastUpdated property whenever an entry is modified, added, or deleted. Summary After building the JavaScript Reference application, I am convinced that HTML5 local storage can have a dramatic impact on the performance of any data-driven web application. If you are building a web application that involves extensive interaction with data then I recommend that you take advantage of this new feature included in the HTML5 standard.

    Read the article

  • What's up with OCFS2?

    - by wcoekaer
    On Linux there are many filesystem choices and even from Oracle we provide a number of filesystems, all with their own advantages and use cases. Customers often confuse ACFS with OCFS or OCFS2 which then causes assumptions to be made such as one replacing the other etc... I thought it would be good to write up a summary of how OCFS2 got to where it is, what we're up to still, how it is different from other options and how this really is a cool native Linux cluster filesystem that we worked on for many years and is still widely used. Work on a cluster filesystem at Oracle started many years ago, in the early 2000's when the Oracle Database Cluster development team wrote a cluster filesystem for Windows that was primarily focused on providing an alternative to raw disk devices and help customers with the deployment of Oracle Real Application Cluster (RAC). Oracle RAC is a cluster technology that lets us make a cluster of Oracle Database servers look like one big database. The RDBMS runs on many nodes and they all work on the same data. It's a Shared Disk database design. There are many advantages doing this but I will not go into detail as that is not the purpose of my write up. Suffice it to say that Oracle RAC expects all the database data to be visible in a consistent, coherent way, across all the nodes in the cluster. To do that, there were/are a few options : 1) use raw disk devices that are shared, through SCSI, FC, or iSCSI 2) use a network filesystem (NFS) 3) use a cluster filesystem(CFS) which basically gives you a filesystem that's coherent across all nodes using shared disks. It is sort of (but not quite) combining option 1 and 2 except that you don't do network access to the files, the files are effectively locally visible as if it was a local filesystem. So OCFS (Oracle Cluster FileSystem) on Windows was born. Since Linux was becoming a very important and popular platform, we decided that we would also make this available on Linux and thus the porting of OCFS/Windows started. The first version of OCFS was really primarily focused on replacing the use of Raw devices with a simple filesystem that lets you create files and provide direct IO to these files to get basically native raw disk performance. The filesystem was not designed to be fully POSIX compliant and it did not have any where near good/decent performance for regular file create/delete/access operations. Cache coherency was easy since it was basically always direct IO down to the disk device and this ensured that any time one issues a write() command it would go directly down to the disk, and not return until the write() was completed. Same for read() any sort of read from a datafile would be a read() operation that went all the way to disk and return. We did not cache any data when it came down to Oracle data files. So while OCFS worked well for that, since it did not have much of a normal filesystem feel, it was not something that could be submitted to the kernel mail list for inclusion into Linux as another native linux filesystem (setting aside the Windows porting code ...) it did its job well, it was very easy to configure, node membership was simple, locking was disk based (so very slow but it existed), you could create regular files and do regular filesystem operations to a certain extend but anything that was not database data file related was just not very useful in general. Logfiles ok, standard filesystem use, not so much. Up to this point, all the work was done, at Oracle, by Oracle developers. Once OCFS (1) was out for a while and there was a lot of use in the database RAC world, many customers wanted to do more and were asking for features that you'd expect in a normal native filesystem, a real "general purposes cluster filesystem". So the team sat down and basically started from scratch to implement what's now known as OCFS2 (Oracle Cluster FileSystem release 2). Some basic criteria were : Design it with a real Distributed Lock Manager and use the network for lock negotiation instead of the disk Make it a Linux native filesystem instead of a native shim layer and a portable core Support standard Posix compliancy and be fully cache coherent with all operations Support all the filesystem features Linux offers (ACL, extended Attributes, quotas, sparse files,...) Be modern, support large files, 32/64bit, journaling, data ordered journaling, endian neutral, we can mount on both endian /cross architecture,.. Needless to say, this was a huge development effort that took many years to complete. A few big milestones happened along the way... OCFS2 was development in the open, we did not have a private tree that we worked on without external code review from the Linux Filesystem maintainers, great folks like Christopher Hellwig reviewed the code regularly to make sure we were not doing anything out of line, we submitted the code for review on lkml a number of times to see if we were getting close for it to be included into the mainline kernel. Using this development model is standard practice for anyone that wants to write code that goes into the kernel and having any chance of doing so without a complete rewrite or.. shall I say flamefest when submitted. It saved us a tremendous amount of time by not having to re-fit code for it to be in a Linus acceptable state. Some other filesystems that were trying to get into the kernel that didn't follow an open development model had a lot harder time and a lot harsher criticism. March 2006, when Linus released 2.6.16, OCFS2 officially became part of the mainline kernel, it was accepted a little earlier in the release candidates but in 2.6.16. OCFS2 became officially part of the mainline Linux kernel tree as one of the many filesystems. It was the first cluster filesystem to make it into the kernel tree. Our hope was that it would then end up getting picked up by the distribution vendors to make it easy for everyone to have access to a CFS. Today the source code for OCFS2 is approximately 85000 lines of code. We made OCFS2 production with full support for customers that ran Oracle database on Linux, no extra or separate support contract needed. OCFS2 1.0.0 started being built for RHEL4 for x86, x86-64, ppc, s390x and ia64. For RHEL5 starting with OCFS2 1.2. SuSE was very interested in high availability and clustering and decided to build and include OCFS2 with SLES9 for their customers and was, next to Oracle, the main contributor to the filesystem for both new features and bug fixes. Source code was always available even prior to inclusion into mainline and as of 2.6.16, source code was just part of a Linux kernel download from kernel.org, which it still is, today. So the latest OCFS2 code is always the upstream mainline Linux kernel. OCFS2 is the cluster filesystem used in Oracle VM 2 and Oracle VM 3 as the virtual disk repository filesystem. Since the filesystem is in the Linux kernel it's released under the GPL v2 The release model has always been that new feature development happened in the mainline kernel and we then built consistent, well tested, snapshots that had versions, 1.2, 1.4, 1.6, 1.8. But these releases were effectively just snapshots in time that were tested for stability and release quality. OCFS2 is very easy to use, there's a simple text file that contains the node information (hostname, node number, cluster name) and a file that contains the cluster heartbeat timeouts. It is very small, and very efficient. As Sunil Mushran wrote in the manual : OCFS2 is an efficient, easily configured, quickly installed, fully integrated and compatible, feature-rich, architecture and endian neutral, cache coherent, ordered data journaling, POSIX-compliant, shared disk cluster file system. Here is a list of some of the important features that are included : Variable Block and Cluster sizes Supports block sizes ranging from 512 bytes to 4 KB and cluster sizes ranging from 4 KB to 1 MB (increments in power of 2). Extent-based Allocations Tracks the allocated space in ranges of clusters making it especially efficient for storing very large files. Optimized Allocations Supports sparse files, inline-data, unwritten extents, hole punching and allocation reservation for higher performance and efficient storage. File Cloning/snapshots REFLINK is a feature which introduces copy-on-write clones of files in a cluster coherent way. Indexed Directories Allows efficient access to millions of objects in a directory. Metadata Checksums Detects silent corruption in inodes and directories. Extended Attributes Supports attaching an unlimited number of name:value pairs to the file system objects like regular files, directories, symbolic links, etc. Advanced Security Supports POSIX ACLs and SELinux in addition to the traditional file access permission model. Quotas Supports user and group quotas. Journaling Supports both ordered and writeback data journaling modes to provide file system consistency in the event of power failure or system crash. Endian and Architecture neutral Supports a cluster of nodes with mixed architectures. Allows concurrent mounts on nodes running 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures. In-built Cluster-stack with DLM Includes an easy to configure, in-kernel cluster-stack with a distributed lock manager. Buffered, Direct, Asynchronous, Splice and Memory Mapped I/Os Supports all modes of I/Os for maximum flexibility and performance. Comprehensive Tools Support Provides a familiar EXT3-style tool-set that uses similar parameters for ease-of-use. The filesystem was distributed for Linux distributions in separate RPM form and this had to be built for every single kernel errata release or every updated kernel provided by the vendor. We provided builds from Oracle for Oracle Linux and all kernels released by Oracle and for Red Hat Enterprise Linux. SuSE provided the modules directly for every kernel they shipped. With the introduction of the Unbreakable Enterprise Kernel for Oracle Linux and our interest in reducing the overhead of building filesystem modules for every minor release, we decide to make OCFS2 available as part of UEK. There was no more need for separate kernel modules, everything was built-in and a kernel upgrade automatically updated the filesystem, as it should. UEK allowed us to not having to backport new upstream filesystem code into an older kernel version, backporting features into older versions introduces risk and requires extra testing because the code is basically partially rewritten. The UEK model works really well for continuing to provide OCFS2 without that extra overhead. Because the RHEL kernel did not contain OCFS2 as a kernel module (it is in the source tree but it is not built by the vendor in kernel module form) we stopped adding the extra packages to Oracle Linux and its RHEL compatible kernel and for RHEL. Oracle Linux customers/users obviously get OCFS2 included as part of the Unbreakable Enterprise Kernel, SuSE customers get it by SuSE distributed with SLES and Red Hat can decide to distribute OCFS2 to their customers if they chose to as it's just a matter of compiling the module and making it available. OCFS2 today, in the mainline kernel is pretty much feature complete in terms of integration with every filesystem feature Linux offers and it is still actively maintained with Joel Becker being the primary maintainer. Since we use OCFS2 as part of Oracle VM, we continue to look at interesting new functionality to add, REFLINK was a good example, and as such we continue to enhance the filesystem where it makes sense. Bugfixes and any sort of code that goes into the mainline Linux kernel that affects filesystems, automatically also modifies OCFS2 so it's in kernel, actively maintained but not a lot of new development happening at this time. We continue to fully support OCFS2 as part of Oracle Linux and the Unbreakable Enterprise Kernel and other vendors make their own decisions on support as it's really a Linux cluster filesystem now more than something that we provide to customers. It really just is part of Linux like EXT3 or BTRFS etc, the OS distribution vendors decide. Do not confuse OCFS2 with ACFS (ASM cluster Filesystem) also known as Oracle Cloud Filesystem. ACFS is a filesystem that's provided by Oracle on various OS platforms and really integrates into Oracle ASM (Automatic Storage Management). It's a very powerful Cluster Filesystem but it's not distributed as part of the Operating System, it's distributed with the Oracle Database product and installs with and lives inside Oracle ASM. ACFS obviously is fully supported on Linux (Oracle Linux, Red Hat Enterprise Linux) but OCFS2 independently as a native Linux filesystem is also, and continues to also be supported. ACFS is very much tied into the Oracle RDBMS, OCFS2 is just a standard native Linux filesystem with no ties into Oracle products. Customers running the Oracle database and ASM really should consider using ACFS as it also provides storage/clustered volume management. Customers wanting to use a simple, easy to use generic Linux cluster filesystem should consider using OCFS2. To learn more about OCFS2 in detail, you can find good documentation on http://oss.oracle.com/projects/ocfs2 in the Documentation area, or get the latest mainline kernel from http://kernel.org and read the source. One final, unrelated note - since I am not always able to publicly answer or respond to comments, I do not want to selectively publish comments from readers. Sometimes I forget to publish comments, sometime I publish them and sometimes I would publish them but if for some reason I cannot publicly comment on them, it becomes a very one-sided stream. So for now I am going to not publish comments from anyone, to be fair to all sides. You are always welcome to email me and I will do my best to respond to technical questions, questions about strategy or direction are sometimes not possible to answer for obvious reasons.

    Read the article

  • SQLAuthority News – Job Interviewing the Right Way (and for the Right Reasons) – Guest Post by Feodor Georgiev

    - by pinaldave
    Feodor Georgiev is a SQL Server database specialist with extensive experience of thinking both within and outside the box. He has wide experience of different systems and solutions in the fields of architecture, scalability, performance, etc. Feodor has experience with SQL Server 2000 and later versions, and is certified in SQL Server 2008. Feodor has written excellent article on Job Interviewing the Right Way. Here is his article in his own language. A while back I was thinking to start a blog post series on interviewing and employing IT personnel. At that time I had just read the ‘Smart and gets things done’ book (http://www.joelonsoftware.com/items/2007/06/05.html) and I was hyped up on some debatable topics regarding finding and employing the best people in the branch. I have no problem with hiring the best of the best; it’s just the definition of ‘the best of the best’ that makes things a bit more complicated. One of the fundamental books one can read on the topic of interviewing is the one mentioned above. If you have not read it, then you must do so; not because it contains the ultimate truth, and not because it gives the answers to most questions on the subject, but because the book contains an extensive set of questions about interviewing and employing people. Of course, a big part of these questions have different answers, depending on location, culture, available funds and so on. (What works in the US may not necessarily work in the Nordic countries or India, or it may work in a different way). The only thing that is valid regardless of any external factor is this: curiosity. In my belief there are two kinds of people – curious and not-so-curious; regardless of profession. Think about it – professional success is directly proportional to the individual’s curiosity + time of active experience in the field. (I say ‘active experience’ because vacations and any distractions do not count as experience :)  ) So, curiosity is the factor which will distinguish a good employee from the not-so-good one. But let’s shift our attention to something else for now: a few tips and tricks for successful interviews. Tip and trick #1: get your priorities straight. Your status usually dictates your priorities; for example, if the person looking for a job has just relocated to a new country, they might tend to ignore some of their priorities and overload others. In other words, setting priorities straight means to define the personal criteria by which the interview process is lead. For example, similar to the following questions can help define the criteria for someone looking for a job: How badly do I need a (any) job? Is it more important to work in a clean and quiet environment or is it important to get paid well (or both, if possible)? And so on… Furthermore, before going to the interview, the candidate should have a list of priorities, sorted by the most importance: e.g. I want a quiet environment, x amount of money, great helping boss, a desk next to a window and so on. Also it is a good idea to be prepared and know which factors can be compromised and to what extent. Tip and trick #2: the interview is a two-way street. A job candidate should not forget that the interview process is not a one-way street. What I mean by this is that while the employer is interviewing the potential candidate, the job seeker should not miss the chance to interview the employer. Usually, the employer and the candidate will meet for an interview and talk about a variety of topics. In a quality interview the candidate will be presented to key members of the team and will have the opportunity to ask them questions. By asking the right questions both parties will define their opinion about each other. For example, if the candidate talks to one of the potential bosses during the interview process and they notice that the potential manager has a hard time formulating a question, then it is up to the candidate to decide whether working with such person is a red flag for them. There are as many interview processes out there as there are companies and each one is different. Some bigger companies and corporates can afford pre-selection processes, 3 or even 4 stages of interviews, small companies usually settle with one interview. Some companies even give cognitive tests on the interview. Why not? In his book Joel suggests that a good candidate should be pampered and spoiled beyond belief with a week-long vacation in New York, fancy hotels, food and who knows what. For all I can imagine, an interview might even take place at the top of the Eifel tower (right, Mr. Joel, right?) I doubt, however, that this is the optimal way to capture the attention of a good employee. The ‘curiosity’ topic What I have learned so far in my professional experience is that opinions can be subjective. Plus, opinions on technology subjects can also be subjective. According to Joel, only hiring the best of the best is worth it. If you ask me, there is no such thing as best of the best, simply because human nature (well, aside from some physical limitations, like putting your pants on through your head :) ) has no boundaries. And why would it have boundaries? I have seen many curious and interesting people, naturally good at technology, though uninterested in it as one  can possibly be; I have also seen plenty of people interested in technology, who (in an ideal world) should have stayed far from it. At any rate, all of this sums up at the end to the ‘supply and demand’ factor. The interview process big-bang boils down to this: If there is a mutual benefit for both the employer and the potential employee to work together, then it all sorts out nicely. If there is no benefit, then it is much harder to get to a common place. Tip and trick #3: word-of-mouth is worth a thousand words Here I would just mention that the best thing a job candidate can get during the interview process is access to future team members or other employees of the new company. Nowadays the world has become quite small and everyone knows everyone. Look at LinkedIn, look at other professional networks and you will realize how small the world really is. Knowing people is a good way to become more approachable and to approach them. Tip and trick #4: Be confident. It is true that for some people confidence is as natural as breathing and others have to work hard to express it. Confidence is, however, a key factor in convincing the other side (potential employer or employee) that there is a great chance for success by working together. But it cannot get you very far if it’s not backed up by talent, curiosity and knowledge. Tip and trick #5: The right reasons What really bothers me in Sweden (and I am sure that there are similar situations in other countries) is that there is a tendency to fill quotas and to filter out candidates by criteria different from their skill and knowledge. In job ads I see quite often the phrases ‘positive thinker’, ‘team player’ and many similar hints about personality features. So my guess here is that discrimination has evolved to a new level. Let me clear up the definition of discrimination: ‘unfair treatment of a person or group on the basis of prejudice’. And prejudice is the ‘partiality that prevents objective consideration of an issue or situation’. In other words, there is not much difference whether a job candidate is filtered out by race, gender or by personality features – it is all a bad habit. And in reality, there is no proven correlation between the technology knowledge paired with skills and the personal features (gender, race, age, optimism). It is true that a significantly greater number of Darwin awards were given to men than to women, but I am sure that somewhere there is a paper or theory explaining the genetics behind this. J This topic actually brings to mind one of my favorite work related stories. A while back I was working for a big company with many teams involved in their processes. One of the teams was occupying 2 rooms – one had the team members and was full of light, colorful posters, chit-chats and giggles, whereas the other room was dark, lighted only by a single monitor with a quiet person in front of it. Later on I realized that the ‘dark room’ person was the guru and the ultimate problem-solving-brain who did not like the chats and giggles and hence was in a separate room. In reality, all severe problems which the chatty and cheerful team members could not solve and all emergencies were directed to ‘the dark room’. And thus all worked out well. The moral of the story: Personality has nothing to do with technology knowledge and skills. End of story. Summary: I’d like to stress the fact that there is no ultimately perfect candidate for a job, and there is no such thing as ‘best-of-the-best’. From my personal experience, the main criteria by which I measure people (co-workers and bosses) is the curiosity factor; I know from experience that the more curious and inventive a person is, the better chances there are for great achievements in their field. Related stories: (for extra credit) 1) Get your priorities straight. A while back as a consultant I was working for a few days at a time at different offices and for different clients, and so I was able to compare and analyze the work environments. There were two different places which I compared and recently I asked a friend of mine the following question: “Which one would you prefer as a work environment: a noisy office full of people, or a quiet office full of faulty smells because the office is rarely cleaned?” My friend was puzzled for a while, thought about it and said: “Hmm, you are talking about two different kinds of pollution… I will probably choose the second, since I can clean the workplace myself a bit…” 2) The interview is a two-way street. One time, during a job interview, I met a potential boss that had a hard time phrasing a question. At that particular time it was clear to me that I would not have liked to work under this person. According to my work religion, the properly asked question contains at least half of the answer. And if I work with someone who cannot ask a question… then I’d be doing double or triple work. At another interview, after the technical part with the team leader of the department, I was introduced to one of the team members and we were left alone for 5 minutes. I immediately jumped on the occasion and asked the blunt question: ‘What have you learned here for the past year and how do you like your job?’ The team member looked at me and said ‘Nothing really. I like playing with my cats at home, so I am out of here at 5pm and I don’t have time for much.’ I was disappointed at the time and I did not take the job offer. I wasn’t that shocked a few months later when the company went bankrupt. 3) The right reasons to take a job: personality check. A while back I was asked to serve as a job reference for a coworker. I agreed, and after some weeks I got a phone call from the company where my colleague was applying for a job. The conversation started with the manager’s question about my colleague’s personality and about their social skills. (You can probably guess what my internal reaction was… J ) So, after 30 minutes of pouring common sense into the interviewer’s head, we finally agreed on the fact that a shy or quiet personality has nothing to do with work skills and knowledge. Some years down the road my former colleague is taking the manager’s position as the manager is demoted to a different department. Reference: Feodor Georgiev, Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, Readers Contribution, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • 64-bit Archives Needed

    - by user9154181
    A little over a year ago, we received a question from someone who was trying to build software on Solaris. He was getting errors from the ar command when creating an archive. At that time, the ar command on Solaris was a 32-bit command. There was more than 2GB of data, and the ar command was hitting the file size limit for a 32-bit process that doesn't use the largefile APIs. Even in 2011, 2GB is a very large amount of code, so we had not heard this one before. Most of our toolchain was extended to handle 64-bit sized data back in the 1990's, but archives were not changed, presumably because there was no perceived need for it. Since then of course, programs have continued to get larger, and in 2010, the time had finally come to investigate the issue and find a way to provide for larger archives. As part of that process, I had to do a deep dive into the archive format, and also do some Unix archeology. I'm going to record what I learned here, to document what Solaris does, and in the hope that it might help someone else trying to solve the same problem for their platform. Archive Format Details Archives are hardly cutting edge technology. They are still used of course, but their basic form hasn't changed in decades. Other than to fix a bug, which is rare, we don't tend to touch that code much. The archive file format is described in /usr/include/ar.h, and I won't repeat the details here. Instead, here is a rough overview of the archive file format, implemented by System V Release 4 (SVR4) Unix systems such as Solaris: Every archive starts with a "magic number". This is a sequence of 8 characters: "!<arch>\n". The magic number is followed by 1 or more members. A member starts with a fixed header, defined by the ar_hdr structure in/usr/include/ar.h. Immediately following the header comes the data for the member. Members must be padded at the end with newline characters so that they have even length. The requirement to pad members to an even length is a dead giveaway as to the age of the archive format. It tells you that this format dates from the 1970's, and more specifically from the era of 16-bit systems such as the PDP-11 that Unix was originally developed on. A 32-bit system would have required 4 bytes, and 64-bit systems such as we use today would probably have required 8 bytes. 2 byte alignment is a poor choice for ELF object archive members. 32-bit objects require 4 byte alignment, and 64-bit objects require 64-bit alignment. The link-editor uses mmap() to process archives, and if the members have the wrong alignment, we have to slide (copy) them to the correct alignment before we can access the ELF data structures inside. The archive format requires 2 byte padding, but it doesn't prohibit more. The Solaris ar command takes advantage of this, and pads ELF object members to 8 byte boundaries. Anything else is padded to 2 as required by the format. The archive header (ar_hdr) represents all numeric values using an ASCII text representation rather than as binary integers. This means that an archive that contains only text members can be viewed using tools such as cat, more, or a text editor. The original designers of this format clearly thought that archives would be used for many file types, and not just for objects. Things didn't turn out that way of course — nearly all archives contain relocatable objects for a single operating system and machine, and are used primarily as input to the link-editor (ld). Archives can have special members that are created by the ar command rather than being supplied by the user. These special members are all distinguished by having a name that starts with the slash (/) character. This is an unambiguous marker that says that the user could not have supplied it. The reason for this is that regular archive members are given the plain name of the file that was inserted to create them, and any path components are stripped off. Slash is the delimiter character used by Unix to separate path components, and as such cannot occur within a plain file name. The ar command hides the special members from you when you list the contents of an archive, so most users don't know that they exist. There are only two possible special members: A symbol table that maps ELF symbols to the object archive member that provides it, and a string table used to hold member names that exceed 15 characters. The '/' convention for tagging special members provides room for adding more such members should the need arise. As I will discuss below, we took advantage of this fact to add an alternate 64-bit symbol table special member which is used in archives that are larger than 4GB. When an archive contains ELF object members, the ar command builds a special archive member known as the symbol table that maps all ELF symbols in the object to the archive member that provides it. The link-editor uses this symbol table to determine which symbols are provided by the objects in that archive. If an archive has a symbol table, it will always be the first member in the archive, immediately following the magic number. Unlike member headers, symbol tables do use binary integers to represent offsets. These integers are always stored in big-endian format, even on a little endian host such as x86. The archive header (ar_hdr) provides 15 characters for representing the member name. If any member has a name that is longer than this, then the real name is written into a special archive member called the string table, and the member's name field instead contains a slash (/) character followed by a decimal representation of the offset of the real name within the string table. The string table is required to precede all normal archive members, so it will be the second member if the archive contains a symbol table, and the first member otherwise. The archive format is not designed to make finding a given member easy. Such operations move through the archive from front to back examining each member in turn, and run in O(n) time. This would be bad if archives were commonly used in that manner, but in general, they are not. Typically, the ar command is used to build an new archive from scratch, inserting all the objects in one operation, and then the link-editor accesses the members in the archive in constant time by using the offsets provided by the symbol table. Both of these operations are reasonably efficient. However, listing the contents of a large archive with the ar command can be rather slow. Factors That Limit Solaris Archive Size As is often the case, there was more than one limiting factor preventing Solaris archives from growing beyond the 32-bit limits of 2GB (32-bit signed) and 4GB (32-bit unsigned). These limits are listed in the order they are hit as archive size grows, so the earlier ones mask those that follow. The original Solaris archive file format can handle sizes up to 4GB without issue. However, the ar command was delivered as a 32-bit executable that did not use the largefile APIs. As such, the ar command itself could not create a file larger than 2GB. One can solve this by building ar with the largefile APIs which would allow it to reach 4GB, but a simpler and better answer is to deliver a 64-bit ar, which has the ability to scale well past 4GB. Symbol table offsets are stored as 32-bit big-endian binary integers, which limits the maximum archive size to 4GB. To get around this limit requires a different symbol table format, or an extension mechanism to the current one, similar in nature to the way member names longer than 15 characters are handled in member headers. The size field in the archive member header (ar_hdr) is an ASCII string capable of representing a 32-bit unsigned value. This places a 4GB size limit on the size of any individual member in an archive. In considering format extensions to get past these limits, it is important to remember that very few archives will require the ability to scale past 4GB for many years. The old format, while no beauty, continues to be sufficient for its purpose. This argues for a backward compatible fix that allows newer versions of Solaris to produce archives that are compatible with older versions of the system unless the size of the archive exceeds 4GB. Archive Format Differences Among Unix Variants While considering how to extend Solaris archives to scale to 64-bits, I wanted to know how similar archives from other Unix systems are to those produced by Solaris, and whether they had already solved the 64-bit issue. I've successfully moved archives between different Unix systems before with good luck, so I knew that there was some commonality. If it turned out that there was already a viable defacto standard for 64-bit archives, it would obviously be better to adopt that rather than invent something new. The archive file format is not formally standardized. However, the ar command and archive format were part of the original Unix from Bell Labs. Other systems started with that format, extending it in various often incompatible ways, but usually with the same common shared core. Most of these systems use the same magic number to identify their archives, despite the fact that their archives are not always fully compatible with each other. It is often true that archives can be copied between different Unix variants, and if the member names are short enough, the ar command from one system can often read archives produced on another. In practice, it is rare to find an archive containing anything other than objects for a single operating system and machine type. Such an archive is only of use on the type of system that created it, and is only used on that system. This is probably why cross platform compatibility of archives between Unix variants has never been an issue. Otherwise, the use of the same magic number in archives with incompatible formats would be a problem. I was able to find information for a number of Unix variants, described below. These can be divided roughly into three tribes, SVR4 Unix, BSD Unix, and IBM AIX. Solaris is a SVR4 Unix, and its archives are completely compatible with those from the other members of that group (GNU/Linux, HP-UX, and SGI IRIX). AIX AIX is an exception to rule that Unix archive formats are all based on the original Bell labs Unix format. It appears that AIX supports 2 formats (small and big), both of which differ in fundamental ways from other Unix systems: These formats use a different magic number than the standard one used by Solaris and other Unix variants. They include support for removing archive members from a file without reallocating the file, marking dead areas as unused, and reusing them when new archive items are inserted. They have a special table of contents member (File Member Header) which lets you find out everything that's in the archive without having to actually traverse the entire file. Their symbol table members are quite similar to those from other systems though. Their member headers are doubly linked, containing offsets to both the previous and next members. Of the Unix systems described here, AIX has the only format I saw that will have reasonable insert/delete performance for really large archives. Everyone else has O(n) performance, and are going to be slow to use with large archives. BSD BSD has gone through 4 versions of archive format, which are described in their manpage. They use the same member header as SVR4, but their symbol table format is different, and their scheme for long member names puts the name directly after the member header rather than into a string table. GNU/Linux The GNU toolchain uses the SVR4 format, and is compatible with Solaris. HP-UX HP-UX seems to follow the SVR4 model, and is compatible with Solaris. IRIX IRIX has 32 and 64-bit archives. The 32-bit format is the standard SVR4 format, and is compatible with Solaris. The 64-bit format is the same, except that the symbol table uses 64-bit integers. IRIX assumes that an archive contains objects of a single ELFCLASS/MACHINE, and any archive containing ELFCLASS64 objects receives a 64-bit symbol table. Although they only use it for 64-bit objects, nothing in the archive format limits it to ELFCLASS64. It would be perfectly valid to produce a 64-bit symbol table in an archive containing 32-bit objects, text files, or anything else. Tru64 Unix (Digital/Compaq/HP) Tru64 Unix uses a format much like ours, but their symbol table is a hash table, making specific symbol lookup much faster. The Solaris link-editor uses archives by examining the entire symbol table looking for unsatisfied symbols for the link, and not by looking up individual symbols, so there would be no benefit to Solaris from such a hash table. The Tru64 ld must use a different approach in which the hash table pays off for them. Widening the existing SVR4 archive symbol tables rather than inventing something new is the simplest path forward. There is ample precedent for this approach in the ELF world. When ELF was extended to support 64-bit objects, the approach was largely to take the existing data structures, and define 64-bit versions of them. We called the old set ELF32, and the new set ELF64. My guess is that there was no need to widen the archive format at that time, but had there been, it seems obvious that this is how it would have been done. The Implementation of 64-bit Solaris Archives As mentioned earlier, there was no desire to improve the fundamental nature of archives. They have always had O(n) insert/delete behavior, and for the most part it hasn't mattered. AIX made efforts to improve this, but those efforts did not find widespread adoption. For the purposes of link-editing, which is essentially the only thing that archives are used for, the existing format is adequate, and issues of backward compatibility trump the desire to do something technically better. Widening the existing symbol table format to 64-bits is therefore the obvious way to proceed. For Solaris 11, I implemented that, and I also updated the ar command so that a 64-bit version is run by default. This eliminates the 2 most significant limits to archive size, leaving only the limit on an individual archive member. We only generate a 64-bit symbol table if the archive exceeds 4GB, or when the new -S option to the ar command is used. This maximizes backward compatibility, as an archive produced by Solaris 11 is highly likely to be less than 4GB in size, and will therefore employ the same format understood by older versions of the system. The main reason for the existence of the -S option is to allow us to test the 64-bit format without having to construct huge archives to do so. I don't believe it will find much use outside of that. Other than the new ability to create and use extremely large archives, this change is largely invisible to the end user. When reading an archive, the ar command will transparently accept either form of symbol table. Similarly, the ELF library (libelf) has been updated to understand either format. Users of libelf (such as the link-editor ld) do not need to be modified to use the new format, because these changes are encapsulated behind the existing functions provided by libelf. As mentioned above, this work did not lift the limit on the maximum size of an individual archive member. That limit remains fixed at 4GB for now. This is not because we think objects will never get that large, for the history of computing says otherwise. Rather, this is based on an estimation that single relocatable objects of that size will not appear for a decade or two. A lot can change in that time, and it is better not to overengineer things by writing code that will sit and rot for years without being used. It is not too soon however to have a plan for that eventuality. When the time comes when this limit needs to be lifted, I believe that there is a simple solution that is consistent with the existing format. The archive member header size field is an ASCII string, like the name, and as such, the overflow scheme used for long names can also be used to handle the size. The size string would be placed into the archive string table, and its offset in the string table would then be written into the archive header size field using the same format "/ddd" used for overflowed names.

    Read the article

< Previous Page | 75 76 77 78 79 80 81 82 83 84 85 86  | Next Page >