non obvious - Page 127 - Developer IT

How to replace all the blanks within square brackets with an underscore using sed?

- by Ringerrr

I figured out that in order to turn [some name] into [some_name] I need to use the following expression: s/$\[[^ ]*$ /\1_/ i.e. create a backreference capture for anything that starts with a literal '[' that contains any number of non space characters, followed by a space, to be replaced with the non space characters followed by an underscore. What I don't know yet though is how to alter this expression so it works for ALL underscores within the braces e.g. [a few words] into [a_few_words]. I sense that I'm close, but am just missing a chunk of knowledge that will unlock the key to making this thing work an infinite number of times within the constraints of the first set of []s contained in a line (of SQL Server DDL in this case). Any suggestions gratefully received....

Read the article

How can I resolve ASP.NET "~" app paths to the website root without a Control being present?

- by jdk

I want to Resolve "~/whatever" from inside non-Page contexts such as Global.asax (HttpApplication), HttpModule, HttpHandler, etc. but can only find such Resolution methods specific to Controls (and Page). I think the app should have enough knowledge to be able to map this outside the Page context. No? Or at least it makes sense to me it should be resolvable in other circumstances, wherever the app root is known. Update: The reason being I'm sticking "~" paths in the web.configuration files, and want to resolve them from the aforementioned non-Control scenarios. Update 2: I'm trying to resolve them to the website root such as Control.Resolve(..) URL behaviour, not to a file system path.

Read the article

Filtering out unique rows in MySQL

- by jpatokal

So I've got a large amount of SQL data that looks basically like this: user | src | dst 1 | 1 | 1 1 | 1 | 1 1 | 1 | 2 1 | 1 | 2 2 | 1 | 1 2 | 1 | 3 I want to filter out pairs of (src,dst) that are unique to one user (even if that user has duplicates), leaving behind only those pairs belonging to more than one user: user | src | dst 1 | 1 | 1 1 | 1 | 1 2 | 1 | 1 In other words, pair (1,2) is unique to user 1 and pair (1,3) to user 2, so they're dropped, leaving behind only all instances of pair (1,1). Any ideas? The answers to the question below can find the non-unique pairs, but my SQL-fu doesn't suffice to handle the complication of requiring that they belong to multiple users as well. [SQL question] How to select non "unique" rows

Read the article

Facebook connect access_token

- by Guillaume Santacruz

Hi i ve got the same issue than this guy: Acces token from facebook is not retrived for sign up? access_token (trying to get propert of non object ) Apparently he found a solution but I do not clearly understand it. Just need you help to understand what should i do. Problem is access_token trying to get property of a non object when i try to log in with facebook connect. the solution I don't understand is this one. "Its was an database error due to session have not created due to facebook app not live."

Read the article

Efficiency: what block size of kernel-mode memory allocations?

- by Robert

I need a big, driver-internal memory buffer with several tens of megabytes (non-paged, since accessed at dispatcher level). Since I think that allocating chunks of non-continuous memory will more likely succeed than allocating one single continuous memory block (especially when memory becomes fragmented) I want to implement that memory buffer as a linked list of memory blocks. What size should the blocks have to efficiently load the memory pages? (read: not to waste any page space) A multiple of 4096? (equally to the page size of the OS) A multiple of 4000? (not to waste another page for OS-internal memory allocation information) Another size? Target platform is Windows NT = 5.1 (XP and above) Target architectures are x86 and amd64 (not Itanium)

Read the article

How to print a variable in reversed byte order in Perl?

- by jth

Hi, I'am trying to convert the variable $num into its reverse byte order and print it out. This is what I have done so far: my $num=0x5514ddb7; my $s=pack('I!',$num); print "$s\n"; He prints it out as some non-printable characters and in a hex editor it looks right, but how can I get it readable on the console? Already tried print sprintf("%#x\n",$s); but he complains about an non-numeric argument, so I think pack returns a string. Any ideas how can I print out `0xb7dd1455 on the console, based on $num?

Read the article

why doesnt' nhibernate support this "exists in list" syntax ??

- by ooo

i have the following query and its failing in Nhibernate 3 LINQ witha a "Non supported" exception. Its similar to this question but this question was asked over a year ago so i am positive that the answer is out of date. My DB tables are: VacationRequest (id, personId) VacationRequestDate (id, vacationRequestId) Person (id, FirstName, LastName) My Entities are: VacationRequest (Person, IList) VacationRequestDate (VacationRequest, Date) Here is the query that is getting a "Non supported" Exception Session.Query<VacationRequestDate>() .Where(r => people .Contains(r.VacationRequest.Person, new PersonComparer())) .Fetch(r=>r.VacationRequest) .ToList(); is there a better way to write this that would be supported in Nhibernate? fyi . .the PersonComparer just compared person.Id

Read the article

Raising hard limit on RLIMIT_NOFILE system-wide on Linux

- by jonswar

We need to raise RLIMIT_NOFILE when running memcached, as we're hitting the default hard limit (1024). However, raising a hard limit requires root, and for various reasons we don't want to have to run memcached or its containing shell as root. Right now we happily run it as a non-root user. Is there a way to raise the hard limit for RLIMIT_NOFILE system-wide, so that we can continue to run memcached as non-root and simply raise the soft limit? This is RedHat Linux with 2.6 kernel. Thanks! Jon

Read the article

Getting the most recent entry per group in a select statement

- by TheObserver

I have 3 tables to join to get table1.code, table1.series, table2.entry_date, table3.title1 and I'm trying to get the most recent non null table3.title1 grouped by table1.code and table1.series. select table1.code, table1.series, max(table2.entry_date), table3.Title1 from table3 INNER JOIN table2 ON table3.ID = table2.ID INNER JOIN table1 ON table2.source_code = table1.code where table3.Title1 is not NULL group by table1.code, table1.series, table3.Title1 seems to give me all entries with a non null title1 instead of the most recent one. How should I structure the query to just pick the newest version of Title1 per code & series?

Read the article

Mdx produces repeated values for a measure and across measures

- by Joe

The MDX query below is giving me repeated measure values as shown in the result below the query. Sometimes it give me save valuea across different measures. SELECT NON EMPTY { [Measures].[Amount], } ON COLUMNS, NON EMPTY { ( [Date_Time].[Date].[Date].ALLMEMBERS * [Date_Time].[Working Day].[Working Day].ALLMEMBERS ) } DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME ON ROWS FROM [DDS] where {[Date_Time].[Year].&[2010-01-01T00:00:00] } Date working day Amount 2010-01-01 00:00:00.000 1 19582 2010-01-02 00:00:00.000 0 19582 2010-01-03 00:00:00.000 0 19582 2010-01-04 00:00:00.000 1 19582 2010-01-05 00:00:00.000 1 19582 2010-01-06 00:00:00.000 1 19582 2010-01-07 00:00:00.000 1 19582 How can I rectify these issues?

Read the article

What does "static" mean in the context of declaring global template functions?

- by smf68

I know what static means in the context of declaring global non-template functions (see e.g. What is a "static" function?), which is useful if you write a helper function in a header that is included from several different locations and want to avoid "duplicate definition" errors. So my question is: What does static mean in the context of declaring global template functions? Please note that I'm specifically asking about global, non-member template functions that do not belong to a class. In other words, what is the difference between the following two: template <typename T> void foo(T t) { /* implementation of foo here */ } template <typename T> static void bar(T t) { /* implementation of bar here */ }

Read the article

If I don't odr-use a variable, can I have multiple definitions of it across translation units?

- by sftrabbit

The standard seems to imply that there is no restriction on the number of definitions of a variable if it is not odr-used (§3.2/3): Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required. It does say that any variable can't be defined multiple times within a translation unit (§3.2/1): No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, or template. But I can't find a restriction for non-odr-used variables across the entire program. So why can't I compile something like the following: // other.cpp int x; // main.cpp int x; int main() {} Compiling and linking these files with g++ 4.6.3, I get a linker error for multiple definition of 'x'. To be honest, I expect this, but since x is not odr-used anywhere (as far as I can tell), I can't see how the standard restricts this. Or is it undefined behaviour?

Read the article

MPI_Bsend and MPI_Isend. How do they work ?

- by GBBL

Hi, using buffered send and non blocking send I was wondering how and if they implement a new level of parallelism in my application eventually generating a thread. Imagine that a slave process generates a large amount of data and want to send it to the master. My idea was to start a buffered or non blocking send then immediately begin to compute the next result. Just when I would have to send the new data I wold check if I can reuse the buffer. This would introduce a new level of parallelism in my application between CPU and communication. Does anybody knows how this is done in MPI ? Does MPI generate a new thread to handle the Bsend or Isend ? Thanks.

Read the article

Coalesce and Case-When with To_Date not working as expected (Postgres bug?)

- by ADTC

I'm using Postgres 9.1. The following query does not work as expected. Coalesce should return the first non-null value. However, this query returns null (1?) instead of the date (2). select COALESCE( TO_DATE('','yyyymmdd'), --(1) TO_DATE('20130201','yyyymmdd') --(2) ); --(1) this evaluates independently to null --(2) this evaluates independently to the date, and therefore is the first non-null value What am I doing wrong? Any workaround? Edit: This may have nothing to do with Coalesce at all. I tried some experiments with Case When constructs; it turns out, Postgres has this big ugly bug where it treats TO_DATE('','yyyymmdd') as not null, even though selecting it returns null.

Read the article

How to join by column name

- by Daniel Vaca

I have a table T1 such that gsdv |nsdv |esdv ------------------- 228.90 |216.41|0.00 and a table T2 such that ds |nm -------------------------- 'Non-Revenue Sales'|'ESDV' 'Gross Sales' |'GSDV' 'Net Sales' |'NSDV' How do I get the following table? ds |nm |val --------------------------------- 'Non-Revenue Sales'|'ESDV'|0.00 'Gross Sales' |'GSDV'|228.90 'Net Sales' |'NSDV'|216.41 I know that I can this by doing the following SELECT ds,nm,esdv val FROM T1,T2 WHERE nm = 'esdv' UNION SELECT ds,nm,gsdv val FROM T1,T2 WHERE nm = 'gsdv' UNION SELECT ds,nm,nsdv val FROM T1,T2 WHERE nm = 'nsdv' but I am looking for a more generic/nicer solution. I am using Sybase, but if you can think of a way to do this with other DBMS, please let me know. Thanks.

Read the article

Is "programmatically" a word? [closed]

- by Lo'oris

I can't find it on any of the online dictionaries I know: dict.org, word reference, urban dictionary, oxford paravia, garzanti. To my ears of a non-native speaker, it sounds horrible. Actually it sounds like a word made-up by another non-native speaker that wanted to say something, didn't know how, and just hacked in a word of his language. The only place I've read it other then user-created-content is the android documentation, so this might or might not be related. Do you happen to know where did it start to be used, why by did it spread so much, what does it really mean?

Read the article

Perl scraping script not recognising certain characters

- by user1849286

I have a script that works fine locally but on the server fails. It displays the non-breaking space symbol   as ? when printing to standard output. In the parsing of the page, if I try to get rid of non breaking space symbol with s/ //g nothing happens, neither getting rid of the question mark s/?//g It seems to stick no matter what. Bizzarely, this is not an issue when running the script locally. Additionally, question marks within a diamond symbol are inserted everywhere (on both the server script and the local script) instead of apostrophes, although at least that is not causing the parsing of the page to break on the local page. Confused, pls help.

Read the article

If item not in the lst in scheme

- by ms. sakura

I'm working on context-free grammars, and I have a function that returns the (terminal values of grammar) for example: i have non-terminal function that results in (A B) , from calling say ((A cat) (B happy np) (A B sad)) so technically A and B are non terminals of the grammar. Now I want to be able to get the terminals (cat happy np sad) (define terminals (lambda (lsts) (cond ((null? lsts) lsts) ((not(member? (car(car lsts)) (n-terminals lsts))) (cons (car(car lsts)) (terminals (car (cdr lsts))))) (else (terminals (cdr lsts)))))) PS: functionality of n-terminals is described above. member? is a boolean function that returns true if an item is a member of the list, false otherwise. My function returns an empty lst. What am I missing here?

Read the article

.NET prerequisites, 3.5sp1 but no 3.5? problem with 4.0?

- by acidzombie24

The title is confusing but the problem is not so much. I made a prerequisite with the default 3.5sp1 and windows installer 3.1. I ran it in my VM and to my surprise it asked me to install .NET. I checked the version and i have .NET 2 sp1, 3 sp1, 3.5, and two variants of 4.0 (client and extended beta). I looked in prerequisites and there doesnt seem to be an options for a non 3.5sp1. Is there some way i can select the non sp1? or compile so i dont need sp1? (it crashes upon startup but i am willing to bet i forgot a resource file)

Read the article

Option Trading: Getting the most out of the event session options

- by extended_events

You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases. There is a reason it’s called Default The default session options specify a total event buffer size of 4 MB with a 30 second latency. Translating this into human terms; this means that our default behavior is that the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, which ever comes first. Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB?The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later. Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (eg. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed. When the event buffers are finally processed there is an economy of scale achieved since most targets support bulk processing of the events so they are processed at the buffer level rather than the individual event level. When all this is working together it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full. I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases… What event just fired?! How about now?! Now?! If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on You Tube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant. Avoid forgotten events by increasing your memory Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events. In order to optimizes for server performance and help ensure that the Extended Events doesn’t block the server if to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the dm_xe_sessions DMV and looking at the dropped_event_count field. Aside: Should you care if you’re dropping events?Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query it won’t make a huge difference to miss a couple executions. Use your best judgment. If you find that your session is dropping events it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine you session to see if you are over collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected. Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss and you should stick with this setting since it is the best choice for keeping the impact on server performance low.You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge since it can result in blocking on the server. If you’re worried that you’re losing events you should be increasing your event buffer memory as described in this section. Some events are too big to fail A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again, this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer. To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.) Partitioning: moving your events to a sub-division Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science… You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created with fairly obvious implication. There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers that are specific to the mode used: None: 3 buffers (fixed)Node: 3 * number_of_nodesCPU: 2.5 * number_of_cpus Here are some examples of what this means for different Node/CPU counts: Configuration None Node CPU 2 CPUs, 1 Node 3 buffers 3 buffers 5 buffers 6 CPUs, 2 Node 3 buffers 6 buffers 15 buffers 40 CPUs, 5 Nodes 3 buffers 15 buffers 100 buffers Aside: Buffer size on multi-processor computersAs the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration. The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things: Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory. Are you over collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session. Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases? <kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff. Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (eg. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact cause by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is largely a matter of experimentation. On the bright side you probably won’t need to to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category. Hey buddy – I think you forgot something OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article

Do’s and Don’ts Building SharePoint Applications

- by Bil Simser

SharePoint is a great platform for building quick LOB applications. Simple things from employee time trackers to server and software inventory to full blown Help Desks can be crafted up using SharePoint from just customizing Lists. No programming necessary. However there are a few tricks I’ve painfully learned over the years that you can use for your own solutions. DO What’s In A Name? When you create a new list, column, or view you’ll commonly name it something like “Expense Reports”. However this has the ugly effect of creating a url to the list as “Expense%20Reports”. Or worse, an internal field name of “Expense_x0x0020_Reports” which is not only cryptic but hard to remember when you’re trying to find the column by internal name. While “Expense Reports 2011” is user friendly, “ExpenseReports2011” is not (unless you’re a programmer). So that’s not the solution. Well, not entirely. Instead when you create your column or list or view use the scrunched up name (I can’t think of the technical term for it right now) of “ExpenseReports2011”, “WomenAtTheOfficeThatAreMen” or “KoalaMeatIsGoodWhenBroiled”. After you’ve created it, go back and change the name to the more friendly “Silly Expense Reports That Nobody Reads”. The original internal name will be the url and code friendly one without spaces while the one used on data entry forms and view headers will be the human version. Smart Columns When building a view include columns that make sense. By default when you add a column the “Add to default view” is checked. Resist the urge to be lazy and leave it checked. Uncheck that puppy and decide consciously what columns should be included in the view. Pick columns that make sense to what the user is trying to do. This means you have to talk to the user. Yes, I know. That can be trying at times and even painful. Go ahead, talk to them. You might learn something. Find out what’s important to them and why. If they’re doing something repetitively as part of their job, try to make their life easier by including what’s most important to them. Do they really need to see the Created *and* Modified date of a document or do they just need the title and author? You’ll only find out after talking to them (or getting them drunk in a bar and leaving them in the back alley handcuffed to a garbage bin, don’t ask). Gotta Keep it Separated Hey, views are there for a reason. Use them. While “All Items” is a fine way to present a list of well, all items, it’s hardly sufficient to present a list of servers built before the Y2K bug hit. You’ll be scrolling the list for hours finally arriving at Page 387 of 12,591 and cursing that SharePoint guy for convincing you that putting your hardware into a list would be of any use to anyone. Next to collecting the data, presenting it is just as important. Views are often overlooked and many times ignored or misused. They’re the way you can slice and dice the data up so that you’re not trying to consume 3,000 years of human evolution on a single web page. Remember views can be filtered so feel free to create a view for each status or one for each operating system or one for each species of Information Worker you might be putting in that list or document library. Not only will it reduce the number of items someone sees at one time, it’ll also make the information that much more relevant. Also remember that each view is a separate page. Use it in navigation by creating a menu on the Quick Launch to each view. The discoverability of the Views menu isn’t overly obvious and if you violate the rule of columns (see Horizontally Scrolling below) the view menu doesn’t even show up until you shuffle the scroll bar to the left. Navigation links, big giant buttons, a screaming flashing “CLICK ME NOW” will help your users find their way. Sort It! Views are great so we’re building nice, rich views for the user. Awesomesauce. However sort is not very discoverable by the user. For example when you’re looking at a view how do you know if it’s ascending or descending and what is it sorted on. Maybe it’s sorted using two fields so what’s that all about? Help your users by letting them know the information they’re looking at is sorted. Maybe you name the view something appropriate like “Bogus Expense Claims Sorted By Deadbeats”. If you use the naming strategy just make sure you keep the name consistent with the description. In the previous example their better be a Deadbeat column so I can see the sort in action. Having a “Loser” column, while equally correct, is a little obtuse to the average Information Worker. Remember, they usually don’t use acronyms and even if they knew how to, it’s not immediately obvious to them that’s what you’re trying to convey. Another option is to simply drop a Content Editor Web Part above the list and explain exactly the view they’re looking at. Each view is it’s own page so one CEWP won’t be used across the board. Be descriptive in what the user is seeing but try to keep it brief. Dumping the first chapter of I, Claudius might be informative to the data but can gobble up screen real estate and miss the point of having the list. DO NOT Useless Attachments The attachments column is, in a word, useless. For the most part. Sure it indicates there’s an attachment on the list item but in the grand scheme of things that’s not overly informative. Maybe it is and by all means, if it makes sense to you include it. Colour it. Make it shine and stand like the Return of Clippy on every SharePoint list. Without it being functional it can be boring. EndUserSharePoint.com has an article to make the son of Clippy that much more useful so feel free to head over and check out this blog post by Paul Grenier on the task (Warning code ahead! Danger Will Robinson!) In any case, I would suggest you remove it from your views. Again if it’s important then include it but consider the jQuery solution above to make it functional. It’s added by default to views and one of things that people forget to clean up. Horizontal Scrolling Screen real estate is premium so building a list that contains 8,000 columns and stretches horizontally across 15 screens probably isn’t the most user friendly experience. Most users can’t figure out how to scroll vertically let alone horizontally so don’t make it even that more confusing for them. Take the Steve Krug approach in your view designs and try not to make the user think. Again views are your friend. Consider splitting up the data into views where one view contains 10 columns and other view contains the other 10. Okay, maybe your information doesn’t work that way but humans can only process 7 pieces of data at a time, 10 at most (then their heads explode and you don’t want to clean that mess up, especially on a Friday night before the big dance). It drives me batshit crazy when I see a view with 80 columns of data. I often ask the user “So what do you do with all this information”. The response is usually “With this data [the first 10 columns] I decide if I’m going to fire everyone, and with this data [the next 10 columns] I decide if I’m going to set the building on fire and collect the insurance”. It’s at that point I show them how to create two new views “People Who Are About To Get The Axe” and “Beach Time For The Executives”. Again, talk to your users and try to reason with them on cutting down the number of columns they see at once. Vertical Scrolling Another big faux pas I find is the use of multi-line comment fields in views. It’s not so bad when you have a statement like this in your view: “I really like, oh my god, thought I was going to scream when I saw this turtle then I decided what I was going to have for dinner and frankly I hate having to work late so when I was talking to the customer I thought, oh my god, what if the customer has turtles and then it appeared to me that I really was hungry so I'm going to have lunch now.” It’s fine if that’s the only column along with two or three others, but once you slap those 20 columns of data into the list, the comment field wraps and forms a new multi-page novel that takes up your entire screen. Do everyone a favour and just avoid adding the column to views. Train the user to just click through to the item if they need to see the contents. Duplicate Information Duplication is never good. Views and great as you can group data together. For example create a view of project status reports grouped by author. Then you can see what project manager is being a dip and not submitting their report. However if you group by author do you really need the Created By field as well in the view? Or if the view is grouped by Project then Author do you need both. Horizontal real estate is always at a premium so try not to clutter up the view with duplicate data like this. Oh yeah, if you’re scratching your head saying “But Bil, if I don’t include the Project name in the view and I have a lot of items then how do I know which one I’m looking at”. That’s a hint that your grouping is too vague or you have too much data in the view based on that criteria. Filter it down a notch, create some views, and try to keep the group down to a single screen where you can see the group header at the top of the page. Again it’s just managing the information you have. Redundant, See Redundant This partially relates to duplicate information and smart columns but basically remember to not include the obvious in a view. Remember, don’t make me think. If you’ve gone to the trouble (and it was a lot of trouble wasn’t it?) to create separate views of your data by creating a “September Zombie Brain Sales”, “October Zombie Brain Sales”, etc. then please for the love of all that is holy do not include the Month and Product columns in your view. Similarly if you create a “My” view of anything (“My Favourite Brands of Spandex”, “My Co-Workers I Find The Urge To Disinfect”) then again, do not include the owner or author field (or whatever field you use to identify “My”). That’s just silly. Hope that helps! Happy customizing!

Read the article

Option Trading: Getting the most out of the event session options

- by extended_events

You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases. There is a reason it’s called Default The default session options specify a total event buffer size of 4 MB with a 30 second latency. Translating this into human terms; this means that our default behavior is that the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, which ever comes first. Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB?The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later. Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (eg. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed. When the event buffers are finally processed there is an economy of scale achieved since most targets support bulk processing of the events so they are processed at the buffer level rather than the individual event level. When all this is working together it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full. I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases… What event just fired?! How about now?! Now?! If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on You Tube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant. Avoid forgotten events by increasing your memory Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events. In order to optimizes for server performance and help ensure that the Extended Events doesn’t block the server if to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the dm_xe_sessions DMV and looking at the dropped_event_count field. Aside: Should you care if you’re dropping events?Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query it won’t make a huge difference to miss a couple executions. Use your best judgment. If you find that your session is dropping events it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine you session to see if you are over collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected. Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss and you should stick with this setting since it is the best choice for keeping the impact on server performance low.You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge since it can result in blocking on the server. If you’re worried that you’re losing events you should be increasing your event buffer memory as described in this section. Some events are too big to fail A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again, this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer. To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.) Partitioning: moving your events to a sub-division Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science… You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created with fairly obvious implication. There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers that are specific to the mode used: None: 3 buffers (fixed)Node: 3 * number_of_nodesCPU: 2.5 * number_of_cpus Here are some examples of what this means for different Node/CPU counts: Configuration None Node CPU 2 CPUs, 1 Node 3 buffers 3 buffers 5 buffers 6 CPUs, 2 Node 3 buffers 6 buffers 15 buffers 40 CPUs, 5 Nodes 3 buffers 15 buffers 100 buffers Aside: Buffer size on multi-processor computersAs the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration. The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things: Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory. Are you over collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session. Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases? <kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff. Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (eg. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact cause by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is largely a matter of experimentation. On the bright side you probably won’t need to to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category. Hey buddy – I think you forgot something OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article

Caching NHibernate Named Queries

- by TStewartDev

I recently started a new job and one of my first tasks was to implement a "popular products" design. The parameters were that it be done with NHibernate and be cached for 24 hours at a time because the query will be pretty taxing and the results do not need to be constantly up to date. This ended up being tougher than it sounds. The database schema meant a minimum of four joins with filtering and ordering criteria. I decided to use a stored procedure rather than letting NHibernate create the SQL for me. Here is a summary of what I learned (even if I didn't ultimately use all of it): You can't, at the time of this writing, use Fluent NHibernate to configure SQL named queries or imports You can return persistent entities from a stored procedure and there are a couple ways to do that You can populate POCOs using the results of a stored procedure, but it isn't quite as obvious You can reuse your named query result mapping other places (avoid duplication) Caching your query results is not at all obvious Testing to see if your cache is working is a pain NHibernate does a lot of things right. Having unified, up-to-date, comprehensive, and easy-to-find documentation is not one of them. By the way, if you're new to this, I'll use the terms "named query" and "stored procedure" (from NHibernate's perspective) fairly interchangeably. Technically, a named query can execute any SQL, not just a stored procedure, and a stored procedure doesn't have to be executed from a named query, but for reusability, it seems to me like the best practice. If you're here, chances are good you're looking for answers to a similar problem. You don't want to read about the path, you just want the result. So, here's how to get this thing going. The Stored Procedure NHibernate has some guidelines when using stored procedures. For Microsoft SQL Server, you have to return a result set. The scalar value that the stored procedure returns is ignored as are any result sets after the first. Other than that, it's nothing special. CREATE PROCEDURE GetPopularProducts @StartDate DATETIME, @MaxResults INT AS BEGIN SELECT [ProductId], [ProductName], [ImageUrl] FROM SomeTableWithJoinsEtc END The Result Class - PopularProduct You have two options to transport your query results to your view (or wherever is the final destination): you can populate an existing mapped entity class in your model, or you can create a new entity class. If you go with the existing model, the advantage is that the query will act as a loader and you'll get full proxied access to the domain model. However, this can be a disadvantage if you require access to the related entities that aren't loaded by your results. For example, my PopularProduct has image references. Unless I tie them into the query (thus making it even more complicated and expensive to run), they'll have to be loaded on access, requiring more trips to the database. Since we're trying to avoid trips to the database by using a second-level cache, we should use the second option, which is to create a separate entity for results. This approach is (I believe) in the spirit of the Command-Query Separation principle, and it allows us to flatten our data and optimize our report-generation process from data source to view. public class PopularProduct { public virtual int ProductId { get; set; } public virtual string ProductName { get; set; } public virtual string ImageUrl { get; set; } } The NHibernate Mappings (hbm) Next up, we need to let NHibernate know about the query and where the results will go. Below is the markup for the PopularProduct class. Notice that I'm using the <resultset> element and that it has a name attribute. The name allows us to drop this into our query map and any others, giving us reusability. Also notice the <import> element which lets NHibernate know about our entity class. <?xml version="1.0" encoding="utf-8" ?> <hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"> <import class="PopularProduct, Infrastructure.NHibernate, Version=1.0.0.0"/> <resultset name="PopularProductResultSet"> <return-scalar column="ProductId" type="System.Int32"/> <return-scalar column="ProductName" type="System.String"/> <return-scalar column="ImageUrl" type="System.String"/> </resultset> </hibernate-mapping> And now the PopularProductsMap: <?xml version="1.0" encoding="utf-8" ?> <hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"> <sql-query name="GetPopularProducts" resultset-ref="PopularProductResultSet" cacheable="true" cache-mode="normal"> <query-param name="StartDate" type="System.DateTime" /> <query-param name="MaxResults" type="System.Int32" /> exec GetPopularProducts @StartDate = :StartDate, @MaxResults = :MaxResults </sql-query> </hibernate-mapping> The two most important things to notice here are the resultset-ref attribute, which links in our resultset mapping, and the cacheable attribute. The Query Class – PopularProductsQuery So far, this has been fairly obvious if you're familiar with NHibernate. This next part, maybe not so much. You can implement your query however you want to; for me, I wanted a self-encapsulated Query class, so here's what it looks like: public class PopularProductsQuery : IPopularProductsQuery { private static readonly IResultTransformer ResultTransformer; private readonly ISessionBuilder _sessionBuilder; static PopularProductsQuery() { ResultTransformer = Transformers.AliasToBean<PopularProduct>(); } public PopularProductsQuery(ISessionBuilder sessionBuilder) { _sessionBuilder = sessionBuilder; } public IList<PopularProduct> GetPopularProducts(DateTime startDate, int maxResults) { var session = _sessionBuilder.GetSession(); var popularProducts = session .GetNamedQuery("GetPopularProducts") .SetCacheable(true) .SetCacheRegion("PopularProductsCacheRegion") .SetCacheMode(CacheMode.Normal) .SetReadOnly(true) .SetResultTransformer(ResultTransformer) .SetParameter("StartDate", startDate.Date) .SetParameter("MaxResults", maxResults) .List<PopularProduct>(); return popularProducts; } } Okay, so let's look at each line of the query execution. The first, GetNamedQuery, matches up with our NHibernate mapping for the sql-query. Next, we set it as cacheable (this is probably redundant since our mapping also specified it, but it can't hurt, right?). Then we set the cache region which we'll get to in the next section. Set the cache mode (optional, I believe), and my cache is read-only, so I set that as well. The result transformer is very important. This tells NHibernate how to transform your query results into a non-persistent entity. You can see I've defined ResultTransformer in the static constructor using the AliasToBean transformer. The name is obviously leftover from Java/Hibernate. Finally, set your parameters and then call a result method which will execute the query. Because this is set to cached, you execute this statement every time you run the query and NHibernate will know based on your parameters whether to use its cached version or a fresh version. The Configuration – hibernate.cfg.xml and Web.config You need to explicitly enable second-level caching in your hibernate configuration: <hibernate-configuration xmlns="urn:nhibernate-configuration-2.2"> <session-factory> [...] <property name="dialect">NHibernate.Dialect.MsSql2005Dialect</property> <property name="cache.provider_class">NHibernate.Caches.SysCache.SysCacheProvider,NHibernate.Caches.SysCache</property> <property name="cache.use_query_cache">true</property> <property name="cache.use_second_level_cache">true</property> [...] </session-factory> </hibernate-configuration> Both properties "use_query_cache" and "use_second_level_cache" are necessary. As this is for a web deployement, we're using SysCache which relies on ASP.NET's caching. Be aware of this if you're not deploying to the web! You'll have to use a different cache provider. We also need to tell our cache provider (in this cache, SysCache) about our caching region: <syscache> <cache region="PopularProductsCacheRegion" expiration="86400" priority="5" /> </syscache> Here I've set the cache to be valid for 24 hours. This XML snippet goes in your Web.config (or in a separate file referenced by Web.config, which helps keep things tidy). The Payoff That should be it! At this point, your queries should run once against the database for a given set of parameters and then use the cache thereafter until it expires. You can, of course, adjust settings to work in your particular environment. Testing Testing your application to ensure it is using the cache is a pain, but if you're like me, you want to know that it's actually working. It's a bit involved, though, so I'll create a separate post for it if comments indicate there is interest.

Read the article

Performance Enhancement in Full-Text Search Query

- by Calvin Sun

Ever since its first release, we are continuing consolidating and developing InnoDB Full-Text Search feature. There is one recent improvement that worth blogging about. It is an effort with MySQL Optimizer team that simplifies some common queries’ Query Plans and dramatically shorted the query time. I will describe the issue, our solution and the end result by some performance numbers to demonstrate our efforts in continuing enhancement the Full-Text Search capability. The Issue: As we had discussed in previous Blogs, InnoDB implements Full-Text index as reversed auxiliary tables. The query once parsed will be reinterpreted into several queries into related auxiliary tables and then results are merged and consolidated to come up with the final result. So at the end of the query, we’ll have all matching records on hand, sorted by their ranking or by their Doc IDs. Unfortunately, MySQL’s optimizer and query processing had been initially designed for MyISAM Full-Text index, and sometimes did not fully utilize the complete result package from InnoDB. Here are a couple examples: Case 1: Query result ordered by Rank with only top N results: mysql> SELECT FTS_DOC_ID, MATCH (title, body) AGAINST ('database') AS SCORE FROM articles ORDER BY score DESC LIMIT 1; In this query, user tries to retrieve a single record with highest ranking. It should have a quick answer once we have all the matching documents on hand, especially if there are ranked. However, before this change, MySQL would almost retrieve rankings for almost every row in the table, sort them and them come with the top rank result. This whole retrieve and sort is quite unnecessary given the InnoDB already have the answer. In a real life case, user could have millions of rows, so in the old scheme, it would retrieve millions of rows' ranking and sort them, even if our FTS already found there are two 3 matched rows. Apparently, the million ranking retrieve is done in vain. In above case, it should just ask for 3 matched rows' ranking, all other rows' ranking are 0. If it want the top ranking, then it can just get the first record from our already sorted result. Case 2: Select Count(*) on matching records: mysql> SELECT COUNT(*) FROM articles WHERE MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE); In this case, InnoDB search can find matching rows quickly and will have all matching rows. However, before our change, in the old scheme, every row in the table was requested by MySQL one by one, just to check whether its ranking is larger than 0, and later comes up a count. In fact, there is no need for MySQL to fetch all rows, instead InnoDB already had all the matching records. The only thing need is to call an InnoDB API to retrieve the count The difference can be huge. Following query output shows how big the difference can be: mysql> select count(*) from searchindex_inno where match(si_title, si_text) against ('people') +----------+ | count(*) | +----------+ | 666877 | +----------+ 1 row in set (16 min 17.37 sec) So the query took almost 16 minutes. Let’s see how long the InnoDB can come up the result. In InnoDB, you can obtain extra diagnostic printout by turning on “innodb_ft_enable_diag_print”, this will print out extra query info: Error log: keynr=2, 'people' NL search Total docs: 10954826 Total words: 0 UNION: Searching: 'people' Processing time: 2 secs: row(s) 666877: error: 10 ft_init() ft_init_ext() keynr=2, 'people' NL search Total docs: 10954826 Total words: 0 UNION: Searching: 'people' Processing time: 3 secs: row(s) 666877: error: 10 Output shows it only took InnoDB only 3 seconds to get the result, while the whole query took 16 minutes to finish. So large amount of time has been wasted on the un-needed row fetching. The Solution: The solution is obvious. MySQL can skip some of its steps, optimize its plan and obtain useful information directly from InnoDB. Some of savings from doing this include: 1) Avoid redundant sorting. Since InnoDB already sorted the result according to ranking. MySQL Query Processing layer does not need to sort to get top matching results. 2) Avoid row by row fetching to get the matching count. InnoDB provides all the matching records. All those not in the result list should all have ranking of 0, and no need to be retrieved. And InnoDB has a count of total matching records on hand. No need to recount. 3) Covered index scan. InnoDB results always contains the matching records' Document ID and their ranking. So if only the Document ID and ranking is needed, there is no need to go to user table to fetch the record itself. 4) Narrow the search result early, reduce the user table access. If the user wants to get top N matching records, we do not need to fetch all matching records from user table. We should be able to first select TOP N matching DOC IDs, and then only fetch corresponding records with these Doc IDs. Performance Results and comparison with MyISAM The result by this change is very obvious. I includes six testing result performed by Alexander Rubin just to demonstrate how fast the InnoDB query now becomes when comparing MyISAM Full-Text Search. These tests are base on the English Wikipedia data of 5.4 Million rows and approximately 16G table. The test was performed on a machine with 1 CPU Dual Core, SSD drive, 8G of RAM and InnoDB_buffer_pool is set to 8 GB. Table 1: SELECT with LIMIT CLAUSE mysql> SELECT si_title, match(si_title, si_text) against('family') as rel FROM si WHERE match(si_title, si_text) against('family') ORDER BY rel desc LIMIT 10; InnoDB MyISAM Times Faster Time for the query 1.63 sec 3 min 26.31 sec 127 You can see for this particular query (retrieve top 10 records), InnoDB Full-Text Search is now approximately 127 times faster than MyISAM. Table 2: SELECT COUNT QUERY mysql>select count(*) from si where match(si_title, si_text) against('family‘); +----------+ | count(*) | +----------+ | 293955 | +----------+ InnoDB MyISAM Times Faster Time for the query 1.35 sec 28 min 59.59 sec 1289 In this particular case, where there are 293k matching results, InnoDB took only 1.35 second to get all of them, while take MyISAM almost half an hour, that is about 1289 times faster!. Table 3: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.5 sec 5.05 sec 10.1 family film 0.95 sec 25.39 sec 26.7 Pizza restaurant orange county California 0.93 sec 32.03 sec 34.4 President united states of America 2.5 sec 36.98 sec 14.8 Table 4: SELECT title and text with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, si_title, si_text, ... as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.61 sec 41.65 sec 68.3 family film 1.15 sec 47.17 sec 41.0 Pizza restaurant orange county california 1.03 sec 48.2 sec 46.8 President united states of america 2.49 sec 44.61 sec 17.9 Table 5: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.5 sec 5.05 sec 10.1 family film 0.95 sec 25.39 sec 26.7 Pizza restaurant orange county califormia 0.93 sec 32.03 sec 34.4 President united states of america 2.5 sec 36.98 sec 14.8 Table 6: SELECT COUNT(*) mysql> SELECT count(*) FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.47 sec 82 sec 174.5 family film 0.83 sec 131 sec 157.8 Pizza restaurant orange county califormia 0.74 sec 106 sec 143.2 President united states of america 1.96 sec 220 sec 112.2 Again, table 3 to table 6 all showing InnoDB consistently outperform MyISAM in these queries by a large margin. It becomes obvious the InnoDB has great advantage over MyISAM in handling large data search. Summary: These results demonstrate the great performance we could achieve by making MySQL optimizer and InnoDB Full-Text Search more tightly coupled. I think there are still many cases that InnoDB’s result info have not been fully taken advantage of, which means we still have great room to improve. And we will continuously explore the area, and get more dramatic results for InnoDB full-text searches. Jimmy Yang, September 29, 2012

Read the article

Why unhandled exceptions are useful

- by Simon Cooper

It’s the bane of most programmers’ lives – an unhandled exception causes your application or webapp to crash, an ugly dialog gets displayed to the user, and they come complaining to you. Then, somehow, you need to figure out what went wrong. Hopefully, you’ve got a log file, or some other way of reporting unhandled exceptions (obligatory employer plug: SmartAssembly reports an application’s unhandled exceptions straight to you, along with the entire state of the stack and variables at that point). If not, you have to try and replicate it yourself, or do some psychic debugging to try and figure out what’s wrong. However, it’s good that the program crashed. Or, more precisely, it is correct behaviour. An unhandled exception in your application means that, somewhere in your code, there is an assumption that you made that is actually invalid. Coding assumptions Let me explain a bit more. Every method, every line of code you write, depends on implicit assumptions that you have made. Take this following simple method, that copies a collection to an array and includes an item if it isn’t in the collection already, using a supplied IEqualityComparer: public static T[] ToArrayWithItem( ICollection<T> coll, T obj, IEqualityComparer<T> comparer) { // check if the object is in collection already // using the supplied comparer foreach (var item in coll) { if (comparer.Equals(item, obj)) { // it's in the collection already // simply copy the collection to an array // and return it T[] array = new T[coll.Count]; coll.CopyTo(array, 0); return array; } } // not in the collection // copy coll to an array, and add obj to it // then return it T[] array = new T[coll.Count+1]; coll.CopyTo(array, 0); array[array.Length-1] = obj; return array; } What’s all the assumptions made by this fairly simple bit of code? coll is never null comparer is never null coll.CopyTo(array, 0) will copy all the items in the collection into the array, in the order defined for the collection, starting at the first item in the array. The enumerator for coll returns all the items in the collection, in the order defined for the collection comparer.Equals returns true if the items are equal (for whatever definition of ‘equal’ the comparer uses), false otherwise comparer.Equals, coll.CopyTo, and the coll enumerator will never throw an exception or hang for any possible input and any possible values of T coll will have less than 4 billion items in it (this is a built-in limit of the CLR) array won’t be more than 2GB, both on 32 and 64-bit systems, for any possible values of T (again, a limit of the CLR) There are no threads that will modify coll while this method is running and, more esoterically: The C# compiler will compile this code to IL according to the C# specification The CLR and JIT compiler will produce machine code to execute the IL on the user’s computer The computer will execute the machine code correctly That’s a lot of assumptions. Now, it could be that all these assumptions are valid for the situations this method is called. But if this does crash out with an exception, or crash later on, then that shows one of the assumptions has been invalidated somehow. An unhandled exception shows that your code is running in a situation which you did not anticipate, and there is something about how your code runs that you do not understand. Debugging the problem is the process of learning more about the new situation and how your code interacts with it. When you understand the problem, the solution is (usually) obvious. The solution may be a one-line fix, the rewrite of a method or class, or a large-scale refactoring of the codebase, but whatever it is, the fix for the crash will incorporate the new information you’ve gained about your own code, along with the modified assumptions. When code is running with an assumption or invariant it depended on broken, then the result is ‘undefined behaviour’. Anything can happen, up to and including formatting the entire disk or making the user’s computer sentient and start doing a good impression of Skynet. You might think that those can’t happen, but at Halting problem levels of generality, as soon as an assumption the code depended on is broken, the program can do anything. That is why it’s important to fail-fast and stop the program as soon as an invariant is broken, to minimise the damage that is done. What does this mean in practice? To start with, document and check your assumptions. As with most things, there is a level of judgement required. How you check and document your assumptions depends on how the code is used (that’s some more assumptions you’ve made), how likely it is a method will be passed invalid arguments or called in an invalid state, how likely it is the assumptions will be broken, how expensive it is to check the assumptions, and how bad things are likely to get if the assumptions are broken. Now, some assumptions you can assume unless proven otherwise. You can safely assume the C# compiler, CLR, and computer all run the method correctly, unless you have evidence of a compiler, CLR or processor bug. You can also assume that interface implementations work the way you expect them to; implementing an interface is more than simply declaring methods with certain signatures in your type. The behaviour of those methods, and how they work, is part of the interface contract as well. For example, for members of a public API, it is very important to document your assumptions and check your state before running the bulk of the method, throwing ArgumentException, ArgumentNullException, InvalidOperationException, or another exception type as appropriate if the input or state is wrong. For internal and private methods, it is less important. If a private method expects collection items in a certain order, then you don’t necessarily need to explicitly check it in code, but you can add comments or documentation specifying what state you expect the collection to be in at a certain point. That way, anyone debugging your code can immediately see what’s wrong if this does ever become an issue. You can also use DEBUG preprocessor blocks and Debug.Assert to document and check your assumptions without incurring a performance hit in release builds. On my coding soapbox… A few pet peeves of mine around assumptions. Firstly, catch-all try blocks: try { ... } catch { } A catch-all hides exceptions generated by broken assumptions, and lets the program carry on in an unknown state. Later, an exception is likely to be generated due to further broken assumptions due to the unknown state, causing difficulties when debugging as the catch-all has hidden the original problem. It’s much better to let the program crash straight away, so you know where the problem is. You should only use a catch-all if you are sure that any exception generated in the try block is safe to ignore. That’s a pretty big ask! Secondly, using as when you should be casting. Doing this: (obj as IFoo).Method(); or this: IFoo foo = obj as IFoo; ... foo.Method(); when you should be doing this: ((IFoo)obj).Method(); or this: IFoo foo = (IFoo)obj; ... foo.Method(); There’s an assumption here that obj will always implement IFoo. If it doesn’t, then by using as instead of a cast you’ve turned an obvious InvalidCastException at the point of the cast that will probably tell you what type obj actually is, into a non-obvious NullReferenceException at some later point that gives you no information at all. If you believe obj is always an IFoo, then say so in code! Let it fail-fast if not, then it’s far easier to figure out what’s wrong. Thirdly, document your assumptions. If an algorithm depends on a non-trivial relationship between several objects or variables, then say so. A single-line comment will do. Don’t leave it up to whoever’s debugging your code after you to figure it out. Conclusion It’s better to crash out and fail-fast when an assumption is broken. If it doesn’t, then there’s likely to be further crashes along the way that hide the original problem. Or, even worse, your program will be running in an undefined state, where anything can happen. Unhandled exceptions aren’t good per-se, but they give you some very useful information about your code that you didn’t know before. And that can only be a good thing.

Search Results

Search found 15499 results on 620 pages for 'non obvious'.

Page 127/620 | < Previous Page | 123 124 125 126 127 128 129 130 131 132 133 134 | Next Page >

- by Ringerrr

- by jdk

- by jpatokal

- by Guillaume Santacruz

- by Robert

- by jth

- by ooo

- by jonswar

- by TheObserver

- by Joe

- by smf68

- by sftrabbit

- by GBBL

- by ADTC

- by Daniel Vaca

- by Lo'oris

- by user1849286

- by ms. sakura

- by acidzombie24

- by extended_events

- by Bil Simser

- by extended_events

- by TStewartDev

- by Calvin Sun

- by Simon Cooper

< Previous Page | 123 124 125 126 127 128 129 130 131 132 133 134 | Next Page >