Search Results

Search found 10991 results on 440 pages for 'stable sort'.

Page 77/440 | < Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >

  • Visual Studio 2010 Beta 2 Startup Failures

    - by Rick Strahl
    I’ve been working with VS 2010 Beta 2 for a while now and while it works Ok most of the time it seems the environment is very, very fragile when it comes to crashes and installed packages. Specifically I’ve been working just fine for days, then when VS 2010 crashes it will not re-start. Instead I get the good old Application cannot start dialog: Other failures I’ve seen bring forth other just as useful dialogs with information overload like Operation cannot be performed which for me specifically happens when trying to compile any project. After a bit of digging around and a post to Microsoft Connect the solution boils down to resetting the VS.NET environment. The Application Cannot Start issue stems from a package load failure of some sort, so the work around for this is typically: c:\program files\Visual Studio 2010\Common7\IDE\devenv.exe /ResetSkipPkgs In most cases that should do the trick. If it doesn’t and the error doesn’t go away the more drastic: c:\program files\Visual Studio 2010\Common7\IDE\devenv.exe /ResetSettings is required which resets all settings in VS to its installation defaults. Between these two I’ve always been able to get VS to startup and run properly. BTW it’s handy to keep a list of command line options for Visual Studio around: http://msdn.microsoft.com/en-us/library/xee0c8y7%28VS.100%29.aspx Note that the /? option in VS 2010 doesn’t display all the options available but rather displays the ‘demo version’ message instead, so the above should be helpful. Also note that unless you install Visual C++ the Visual Studio Command Prompt icon is not automatically installed so you may have to navigate manually to the appropriate folder above. Cannot Build Failures If you get the Cannot compile error dialog, there is another thing that have worked for me: Change your project build target from Debug to Release (or whatever – just change it) and compile again. If that doesn’t work doing the reset steps above will do it for me. It appears this failure comes from some sort of interference of other versions of Visual Studio installed on the system and running another version first. Resetting the build target explicitly seems to reset the build providers to a normalized state so that things work in many cases. But not all. Worst case – resetting settings will do it. The bottom line for working in VS 2010 has been – don’t get too attached to your custom settings as they will get blown away quite a bit. I’ve probably been through 20 or more of these VS resets although I’ve been working with it quite a bit on an internal project. It’s kind of frustrating to see this kind of high level instability in a Beta 2 product which is supposedly the last public beta they will put out. On the other hand this beta has been otherwise rather stable and performance is roughly equivalent to VS 2008. Although I mention the crash above – crashes I’ve seen have been relatively rare and no more frequent than in VS 2008 it seems. Given the drastic UI changes in VS 2010 (using WPF for the shell and editor) I’m actually impressed that the product is as stable as it is at this point. Also I was seriously worried about text quality going to a WPF model, but thankfully WPF 4.0 addresses the blurry text issue with native font rendering to render text on non-cleartype enabled systems crisply. Anyway I hope that these notes are helpful to some of you playing around with the beta and running into problems. Hopefully you won’t need them :-}© Rick Strahl, West Wind Technologies, 2005-2010

    Read the article

  • Logical Domain Modeling Made Simple

    - by Knut Vatsendvik
    How can logical domain modeling be made simple and collaborative? Many non-technical end-users, managers and business domain experts find it difficult to understand the visual models offered by many UML tools. This creates trouble in capturing and verifying the information that goes into a logical domain model. The tools are also too advanced and complex for a non-technical user to learn and use. We have therefore, in our current project, ended up with using Confluence as tool for designing the logical domain model with the help of a few very useful plugins. Big thanks to Ole Nymoen and Per Spilling for their expertise in this field that made this posting possible. Confluence Plugins Here is a list of Confluence plugins used in this solution. Install these before trying out the macros used below. Plugin Description Copy Space Allows a space administrator to copy a space, including the pages within the space Metadata Supports adding metadata to Wiki pages Label Manages labeling of pages Linking Contains macros for linking to templates, the dashboard and other Table Enhances the table capability in Confluence Creating a Confluence Space First we need to create a new confluence space for the domain model. Click the link Create a Space located below the list of spaces on the Dashboard. Please contact your Confluence administrator is you do not have permissions to do this.   For illustrative purpose all attributes and entities in this posting are based on my imaginary project manager domain model. When a logical domain model is good enough for being implemented, do a copy of the Confluence Space (see Copy Space plugin). In this way you create a stable version of the logical domain model while further design can continue with the new copied space. Typical will the implementation phase result in a database design and/or a XSD schema design. Add Space Templates Go to the Home page of your Confluence Space. Navigate to the Browse drop-down menu and click on Advanced. Then click the Templates option in the left navigation panel. Click Add New Space Template to add the following three templates. Name: attribute {metadata-list} || Name | | || Type | | || Format | | || Description | | {metadata-list} {add-label:attribute} Name: primary-type {metadata-list} || Name | || || Type | || || Format | || || Description | || {metadata-list} {add-label:primary-type} Name: complex-type {metadata-list} || Name | || || Description |  || {metadata-list} h3. Attributes || Name || Type || Format || Description || | [name] | {metadata-from:name|Type} | {metadata-from:name|Format} | {metadata-from:name|Description} | {add-label:complex-type,entity} The metadata-list macro (see Metadata plugin) will save a list of metadata values to the page. The add-label macro (see Label plugin) will automatically label the page. Primary Types Page Our first page to add will act as container for our primary types. Switch to Wiki markup when adding the following content to the page. | (+) {add-page:template=primary-type|parent=@self}Add new primary type{add-page} | {metadata-report:Name,Type,Format,Description|sort=Name|root=@self|pages=@descendents} Once the page is created, click the Add new primary type (create-page macro) to start creating a new pages. Here is an example of input to the LocalDate page. Embrace the LocalDate with square brackets [] to make the page linkable. Again switch to Wiki markup before editing. {metadata-list} || Name | [LocalDate] || || Type | Date || || Format | YYYY-MM-DD || || Description | Date in local time zone. YYYY = year, MM = month and DD = day || {metadata-list} {add-label:primary-type} The metadata-report macro will show a tabular report of all child pages.   Attributes Page The next page will act as container for all of our attributes. | (+) {add-page:template=attribute|parent=@self|title=attribute}Add new attribute{add-page} | {metadata-report:Name,Type,Format,Description|sort=Name|pages=@descendants} Here is an example of input to the startDate page. {metadata-list} || Name | [startDate] || || Type | [LocalDate] || || Format | {metadata-from:LocalDate|Format} || || Description | The projects start date || {metadata-list} {add-label:attribute} Using the metadata-from macro we fetch the text from the previously created LocalDate page. Complex Types Page The last page in this example shows how attributes can be combined together to form more complex types.   h3. Intro Overview of complex types in the domain model. | (+) {add-page:template=complex-type|parent=@self}Add a new complex type{add-page}\\ | {metadata-report:Name,Description|sort=Name|root=@self|pages=@descendents} Here is an example of input to the ProjectType page. {metadata-list} || Name | [ProjectType] || || Description | Represents a project || {metadata-list} h3. Attributes || Name || Type || Format || Description || | [projectId] | {metadata-from:projectId|Type} | {metadata-from:projectId|Format} | {metadata-from:projectId|Description} | | [name] | {metadata-from:name|Type} | {metadata-from:name|Format} | {metadata-from:name|Description} | | [description] | {metadata-from:description|Type} | {metadata-from:description|Format} | {metadata-from:description|Description} | | [startDate] | {metadata-from:startDate|Type} | {metadata-from:startDate|Format} | {metadata-from:startDate|Description} | {add-label:complex-type,entity} Gives us this Conclusion Using a web-based corporate Wiki like Confluence to create a logical domain model increases the collaboration between people with different roles in the enterprise. It’s my believe that this helps the domain model to be more accurate, and better documented. In our real project we have more pages than illustrated here to complete the documentation. We do also still use UML tools to create different types of diagrams that Confluence do not support. As a last tip, an ImageMap plugin can make those diagrams clickable when used in pages. Enjoy!

    Read the article

  • Talend Enterprise Data Integration overperforms on Oracle SPARC T4

    - by Amir Javanshir
    The SPARC T microprocessor, released in 2005 by Sun Microsystems, and now continued at Oracle, has a good track record in parallel execution and multi-threaded performance. However it was less suited for pure single-threaded workloads. The new SPARC T4 processor is now filling that gap by offering a 5x better single-thread performance over previous generations. Following our long-term relationship with Talend, a fast growing ISV positioned by Gartner in the “Visionaries” quadrant of the “Magic Quadrant for Data Integration Tools”, we decided to test some of their integration components with the T4 chip, more precisely on a T4-1 system, in order to verify first hand if this new processor stands up to its promises. Several tests were performed, mainly focused on: Single-thread performance of the new SPARC T4 processor compared to an older SPARC T2+ processor Overall throughput of the SPARC T4-1 server using multiple threads The tests consisted in reading large amounts of data --ten's of gigabytes--, processing and writing them back to a file or an Oracle 11gR2 database table. They are CPU, memory and IO bound tests. Given the main focus of this project --CPU performance--, bottlenecks were removed as much as possible on the memory and IO sub-systems. When possible, the data to process was put into the ZFS filesystem cache, for instance. Also, two external storage devices were directly attached to the servers under test, each one divided in two ZFS pools for read and write operations. Multi-thread: Testing throughput on the Oracle T4-1 The tests were performed with different number of simultaneous threads (1, 2, 4, 8, 12, 16, 32, 48 and 64) and using different storage devices: Flash, Fibre Channel storage, two stripped internal disks and one single internal disk. All storage devices used ZFS as filesystem and volume management. Each thread read a dedicated 1GB-large file containing 12.5M lines with the following structure: customerID;FirstName;LastName;StreetAddress;City;State;Zip;Cust_Status;Since_DT;Status_DT 1;Ronald;Reagan;South Highway;Santa Fe;Montana;98756;A;04-06-2006;09-08-2008 2;Theodore;Roosevelt;Timberlane Drive;Columbus;Louisiana;75677;A;10-05-2009;27-05-2008 3;Andrew;Madison;S Rustle St;Santa Fe;Arkansas;75677;A;29-04-2005;09-02-2008 4;Dwight;Adams;South Roosevelt Drive;Baton Rouge;Vermont;75677;A;15-02-2004;26-01-2007 […] The following graphs present the results of our tests: Unsurprisingly up to 16 threads, all files fit in the ZFS cache a.k.a L2ARC : once the cache is hot there is no performance difference depending on the underlying storage. From 16 threads upwards however, it is clear that IO becomes a bottleneck, having a good IO subsystem is thus key. Single-disk performance collapses whereas the Sun F5100 and ST6180 arrays allow the T4-1 to scale quite seamlessly. From 32 to 64 threads, the performance is almost constant with just a slow decline. For the database load tests, only the best IO configuration --using external storage devices-- were used, hosting the Oracle table spaces and redo log files. Using the Sun Storage F5100 array allows the T4-1 server to scale up to 48 parallel JVM processes before saturating the CPU. The final result is a staggering 646K lines per second insertion in an Oracle table using 48 parallel threads. Single-thread: Testing the single thread performance Seven different tests were performed on both servers. Given the fact that only one thread, thus one file was read, no IO bottleneck was involved, all data being served from the ZFS cache. Read File ? Filter ? Write File: Read file, filter data, write the filtered data in a new file. The filter is set on the “Status” column: only lines with status set to “A” are selected. This limits each output file to about 500 MB. Read File ? Load Database Table: Read file, insert into a single Oracle table. Average: Read file, compute the average of a numeric column, write the result in a new file. Division & Square Root: Read file, perform a division and square root on a numeric column, write the result data in a new file. Oracle DB Dump: Dump the content of an Oracle table (12.5M rows) into a CSV file. Transform: Read file, transform, write the result data in a new file. The transformations applied are: set the address column to upper case and add an extra column at the end, which is the concatenation of two columns. Sort: Read file, sort a numeric and alpha numeric column, write the result data in a new file. The following table and graph present the final results of the tests: Throughput unit is thousand lines per second processed (K lines/second). Improvement is the % of improvement between the T5140 and T4-1. Test T4-1 (Time s.) T5140 (Time s.) Improvement T4-1 (Throughput) T5140 (Throughput) Read/Filter/Write 125 806 645% 100 16 Read/Load Database 195 1111 570% 64 11 Average 96 557 580% 130 22 Division & Square Root 161 1054 655% 78 12 Oracle DB Dump 164 945 576% 76 13 Transform 159 1124 707% 79 11 Sort 251 1336 532% 50 9 The improvement of single-thread performance is quite dramatic: depending on the tests, the T4 is between 5.4 to 7 times faster than the T2+. It seems clear that the SPARC T4 processor has gone a long way filling the gap in single-thread performance, without sacrifying the multi-threaded capability as it still shows a very impressive scaling on heavy-duty multi-threaded jobs. Finally, as always at Oracle ISV Engineering, we are happy to help our ISV partners test their own applications on our platforms, so don't hesitate to contact us and let's see what the SPARC T4-based systems can do for your application! "As describe in this benchmark, Talend Enterprise Data Integration has overperformed on T4. I was generally happy to see that the T4 gave scaling opportunities for many scenarios like complex aggregations. Row by row insertion in Oracle DB is faster with more than 650,000 rows per seconds without using any bulk Oracle capabilities !" Cedric Carbone, Talend CTO.

    Read the article

  • Ubuntu 12.04 / 12.10 Randomly Freezing - nVidia?

    - by Alix Axel
    My Ubuntu install frequently freezes, sometimes showing a black screen (not very common anymore - in my latest installs), some other times the mouse and keyboard just fail to move and respond (not even Ctrl + Alt + F1 works) and some other times I'm able to move the mouse with a huge delay (2-5 seconds) but I'm not able to do/click anything. I have a pretty strong feeling that this problem is related to my graphic card drivers because: after hard reset, I usually get error reports about X.org / jockey it's common for artifacts to appear during loading / shutdown / whenever, for instance: pattern filled with £ during log off ugly-colored squared pattern during boot windows that are partially moved (i.e.: only the top half) Firefox renderings that leave the bottom ~30% of the page black These artifacts appear right before the system freezes. I've installed Ubuntu 12.04 LTS and after several failed attempts to get my dual monitor setup to work properly I tried installing the new 12.10 version, hoping that this new version would have this problem solved... Unfortunatly, that was not the case, so I reverted to Ubuntu 12.04. I've tried all the drivers in the Additional Drivers application (even the experimental ones), I've also tried the nvidia-current package from the PPA repository ubuntu-x-swat/x-updates as well as the nouveau OSS driver. Nothing (except no driver at all with a 640*480 resolution) at all seems stable. Here is the info of my graphic card: alix@alix-E500:~$ lspci | grep VGA 01:00.0 VGA compatible controller: NVIDIA Corporation G86 [GeForce 8400M G] (rev a1) alix@alix-E500:~$ sudo lshw -C video [sudo] password for alix: *-display description: VGA compatible controller product: G86 [GeForce 8400M G] vendor: NVIDIA Corporation physical id: 0 bus info: pci@0000:01:00.0 version: a1 width: 64 bits clock: 33MHz capabilities: pm msi pciexpress vga_controller bus_master cap_list rom configuration: driver=nouveau latency=0 resources: irq:16 memory:fd000000-fdffffff memory:d0000000-dfffffff memory:fa000000-fbffffff ioport:cc00(size=128) memory:fe0e0000-fe0fffff Right now, I don't even have my 22" monitor connected as I can't even get my laptop display to work properly and without freezes. I've searched, read and tried all that I could (over several fresh reinstalls) to fix the problem, but so far, no solution has proven definitive. I'm sorry I can't precise which symptom maps to each driver but I've been trying to solve this one on my own without logging what I'm doing, perhaps someone here will be able to point me to a certain-fix solution, if not I'll keep updating this question as I go along. Please let me know if any more info is needed to pinpoint the exact problem. Trying out NVIDIA accelerated graphics driver (version 173). The scrolling, minimizing / maximizing windows takes between 2 and 5 seconds to finalize. Context menus also pop up very slowly and the typing seems delayed by ~1 second. No critical issues so far. Firefox rendering of the Save Edits button is consistently messed up (random black lines in the top). Trying out NVIDIA accelerated graphics driver (version current) [Recommended]. All the delays mentioned above and the buggy rendering of the Save Edits button are gone, but I'm noticing that the whole screen flashes black for a couple of microseconds and while I was writing this test for the first time, the bottom 30% of the screen went black and I couldn't do anything (not even Ctrl + Alt + F1 would work). Had to force a hard reset. Also, the system hanged a little for a couple of seconds with the fade out of the "Restart" menu. Trying out NVIDIA accelerated graphics driver (*experimental*beta) (version experimental-304). Same symptoms as before, it crashed once while I was trying to install Chromium and again after a hard reset when I was trying to remove the driver. The bottom of the screen did not went black and I could move my mouse both times. Ctrl + Alt + F1 didn't work. The ugly-colored pattern also showed up during the second boot. Trying out NVIDIA accelerated graphics driver (*experimental*beta) (version experimental-307). The system crashed as soon as I clicked something. Had to do a fresh re-install. Trying out Nouveau: Accelerated Open Source driver for nVidia cards. Artifacts still show up during boot but other than that this one seems stable. As soon as I connected my second monitor, the responsiveness dropped a lot, animations and video are somewhat slow. I'm gonna try this solution http://askubuntu.com/a/98871/9018 later on.

    Read the article

  • How to get bearable 2D and 3D performance on AMD Radeon HD 6950?

    - by l0b0
    I have had an AMD Radeon HD 6950 (i.e., Cayman series) for a couple years now, and I have tried a lot of combinations of drivers and settings with terrible results. I'm completely at a loss as to how to proceed. The open source driver has much better 2D performance, but it offloads all OpenGL rendering to the CPU. What I've tried so far: All the latest stable Ubuntu releases in the period, plus one Linux Mint release. All the latest stable AMD Catalyst Proprietary Display Drivers, and currently 13.1. The unofficial wiki installation instructions for every Ubuntu version and the semi-official Ubuntu instructions. All the tips and tweaks I could find for Minecraft (Optifine, reducing settings to minimum), VLC (postprocessing at minimum, rendering at native video size), Catalyst Control Center (flipped every lever in there) and X11 (some binary toggles I can no longer remember). Results: Typically 13-15 FPS in Minecraft, 30 max (100+ in Windows with the same driver version). Around 10 FPS in Team Fortress 2 using the official Steam client. Choppy video playback, in Flash and with VLC. CPU use goes through the roof when rendering video (150% for 1080p on YouTube in Chromium, 100% for 1080p H264 in VLC). glxgears shows 12.5 FPS when maximized. fgl_glxgears shows 10 FPS when maximized. Hardware details from lshw: Motherboard ASUS P6X58D-E CPU Intel Core i7 CPU 950 @ 3.07GHz (never overclocked; 64 bit) 6 GB RAM Video card product "Cayman PRO [Radeon HD 6950]", vendor "Hynix Semiconductor (Hyundai Electronics)" 2 x 1920x1200 monitors, both connected with HDMI. I feel I must be missing something absolutely fundamental here. Is there no accelerated support for anything on 64-bit architectures? Does a dual monitor completely mess up the driver? $ fglrxinfo display: :0 screen: 0 OpenGL vendor string: Advanced Micro Devices, Inc. OpenGL renderer string: AMD Radeon HD 6900 Series OpenGL version string: 4.2.11995 Compatibility Profile Context $ glxinfo | grep 'direct rendering' direct rendering: Yes I am currently using the open source driver, with the following results: Full frame rate and low CPU load when playing 1080p video. Black screen (but music in the background) in Team Fortress 2. Similar performance in Minecraft as the Catalyst driver. In hindsight obvious, since both end up offloading the rendering to the CPU. My /var/log/Xorg.0.log after upgrading to AMD Catalyst 13.1. Some possibly important lines: (WW) Falling back to old probe method for fglrx (WW) fglrx: No matching Device section for instance (BusID PCI:0@3:0:1) found The generated xorg.conf. The disabled "monitor" 0-DFP9 is actually an A/V receiver, which sometimes confuses the monitor drivers when turned on/off (but not in Windows). All three "monitor" devices are connected with HDMI. Edit: Chris Carter's suggestion to use the xorg-edgers PPA (Catalyst 13.1) resulted in some improvement, but still pretty bad performance overall: Minecraft stabilizes at 13-17 FPS, but at least the CPU load is "only" at 45-60%. Still 150% CPU use for 1080p video rendering on YouTube in Chromium. Massive improvement for 1080p H264 in VLC: 40-50% CPU use and no visible jitter glxgears performance about doubled to 25-30 FPS when maximized. fgl_glxgears still at ~10 FPS when maximized.

    Read the article

  • SQL Table stored as a Heap - the dangers within

    - by MikeD
    Nearly all of the time I create a table, I include a primary key, and often that PK is implemented as a clustered index. Those two don't always have to go together, but in my world they almost always do. On a recent project, I was working on a data warehouse and a set of SSIS packages to import data from an OLTP database into my data warehouse. The data I was importing from the business database into the warehouse was mostly new rows, sometimes updates to existing rows, and sometimes deletes. I decided to use the MERGE statement to implement the insert, update or delete in the data warehouse, I found it quite performant to have a stored procedure that extracted all the new, updated, and deleted rows from the source database and dump it into a working table in my data warehouse, then run a stored proc in the warehouse that was the MERGE statement that took the rows from the working table and updated the real fact table. Use Warehouse CREATE TABLE Integration.MergePolicy (PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date, Operation varchar(5)) CREATE TABLE fact.Policy (PolicyKey int identity primary key, PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date) CREATE PROC Integration.MergePolicy as begin begin tran Merge fact.Policy as tgtUsing Integration.MergePolicy as SrcOn (tgt.PolicyId = Src.PolicyId) When not matched by Target then Insert (PolicyId, PolicyTypeKey, Premium, Deductible, EffectiveDate)values (src.PolicyId, src.PolicyTypeKey, src.Premium, src.Deductible, src.EffectiveDate) When matched and src.Operation = 'U' then Update set PolicyTypeKey = src.PolicyTypeKey,Premium = src.Premium,Deductible = src.Deductible,EffectiveDate = src.EffectiveDate When matched and src.Operation = 'D' then Delete ;delete from Integration.WorkPolicy commit end Notice that my worktable (Integration.MergePolicy) doesn't have any primary key or clustered index. I didn't think this would be a problem, since it was relatively small table and was empty after each time I ran the stored proc. For one of the work tables, during the initial loads of the warehouse, it was getting about 1.5 million rows inserted, processed, then deleted. Also, because of a bug in the extraction process, the same 1.5 million rows (plus a few hundred more each time) was getting inserted, processed, and deleted. This was being sone on a fairly hefty server that was otherwise unused, and no one was paying any attention to the time it was taking. This week I received a backup of this database and loaded it on my laptop to troubleshoot the problem, and of course it took a good ten minutes or more to run the process. However, what seemed strange to me was that after I fixed the problem and happened to run the merge sproc when the work table was completely empty, it still took almost ten minutes to complete. I immediately looked back at the MERGE statement to see if I had some sort of outer join that meant it would be scanning the target table (which had about 2 million rows in it), then turned on the execution plan output to see what was happening under the hood. Running the stored procedure again took a long time, and the plan output didn't show me much - 55% on the MERGE statement, and 45% on the DELETE statement, and table scans on the work table in both places. I was surprised at the relative cost of the DELETE statement, because there were really 0 rows to delete, but I was expecting to see the table scans. (I was beginning now to suspect that my problem was because the work table was being stored as a heap.) Then I turned on STATS_IO and ran the sproc again. The output was quite interesting.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Policy'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'MergePolicy'. Scan count 1, logical reads 433276, physical reads 60, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. I've reproduced the above from memory, the details aren't exact, but the essential bit was the very high number of logical reads on the table stored as a heap. Even just doing a SELECT Count(*) from Integration.MergePolicy incurred that sort of output, even though the result was always 0. I suppose I should research more on the allocation and deallocation of pages to tables stored as a heap, but I haven't, and my original assumption that a table stored as a heap with no rows would only need to read one page to answer any query was definitely proven wrong. It's likely that some sort of physical defragmentation of the table may have cleaned that up, but it seemed that the easiest answer was to put a clustered index on the table. After doing so, the execution plan showed a cluster index scan, and the IO stats showed only a single page read. (I aborted my first attempt at adding a clustered index on the table because it was taking too long - instead I ran TRUNCATE TABLE Integration.MergePolicy first and added the clustered index, both of which took very little time). I suspect I may not have noticed this if I had used TRUNCATE TABLE Integration.MergePolicy instead of DELETE FROM Integration.MergePolicy, since I'm guessing that the truncate operation does some rather quick releasing of pages allocated to the heap table. In the future, I will likely be much more careful to have a clustered index on every table I use, even the working tables. Mike  

    Read the article

  • Part 6: Extensions vs. Modifications

    - by volker.eckardt(at)oracle.com
    Customizations = Extensions + Modifications In the EBS terminology, a customization can be an extension or a modification. Extension means that you mainly create your own code from scratch. You may utilize existing views, packages and java classes, but your code is unique. Modifications are quite different, because here you take existing code and change or enhance certain areas to achieve a slightly different behavior. Important is that it doesn't matter if you place your code at the same or at another place – it is a modification. It is also not relevant if you leave the original code enabled or not! Why? Here is the answer: In case the original code piece you have taken as your base will get patched, you need to copy the source again and apply all your changes once more. If you don't do that, you may get different results or write different data compared to the standard – this causes a high risk! Here are some guidelines how to reduce the risk: Invest a bit longer when searching for objects to select data from. Rather choose a view than a table. In case Oracle development changes the underlying tables, the view will be more stable and is therefore a better choice. Choose rather public APIs over internal APIs. Same background as before: although internal structure might change, the public API is more stable. Use personalization and substitution rather than modification. Spend more time to check if the requirement can be covered with such techniques. Build a project code library, avoid that colleagues creating similar functionality multiple times. Otherwise you have to review lots of similar code to determine the need for correction. Use the technique of “flagged files”. Flagged files is a way to mark a standard deployment file. If you run the patch analyse (within Application Manager), the analyse result will list flagged standard files in case they will be patched. If you maintain a cross reference to your own CEMLIs, you can easily determine which CEMLIs have to be reviewed. Implement a code review process. This can be done by utilizing team internal or external persons. If you implement such a team internal process, your team members will come up with suggestions how to improve the code quality by themselves. Review heavy customizations regularly, to identify options to reduce complexity; let's say perform this every 6th month. You may not spend days for such a review, but a high level cross check if the customization can be reduced is suggested. De-install customizations which are no more required. Define a process for this. Add a section into the technical documentation how to uninstall and what are possible implications. Maintain a cross reference between CEMLIs and between CEMLIs, EBS modules and business processes. Keep this list up to date! Share this list! By following these guidelines, you are able to improve product stability. Although we might not be able to avoid modifications completely, we can give a much better advise to developers and to our test team. Summary: Extensions and Modifications have to be handled differently during their lifecycle. Modifications implicate a much higher risk and should therefore be reviewed more frequently. Good cross references allow you to give clear advise for the testing activities.

    Read the article

  • Breaking 1NF to model subset constraints. Does this sound sane?

    - by Chris Travers
    My first question here. Appologize if it is in the wrong forum but this seems pretty conceptual. I am looking at doing something that goes against conventional wisdom and want to get some feedback as to whether this is totally insane or will result in problems, so critique away! I am on PostgreSQL 9.1 but may be moving to 9.2 for this part of this project. To re-iterate: Does it seem sane to break 1NF in this way? I am not looking for debugging code so much as where people see problems that this might lead. The Problem In double entry accounting, financial transactions are journal entries with an arbitrary number of lines. Each line has either a left value (debit) or a right value (credit) which can be modelled as a single value with negatives as debits and positives as credits or vice versa. The sum of all debits and credits must equal zero (so if we go with a single amount field, sum(amount) must equal zero for each financial journal entry). SQL-based databases, pretty much required for this sort of work, have no way to express this sort of constraint natively and so any approach to enforcing it in the database seems rather complex. The Write Model The journal entries are append only. There is a possibility we will add a delete model but it will be subject to a different set of restrictions and so is not applicable here. If and when we allow deletes, we will probably do them using a simple ON DELETE CASCADE designation on the foreign key, and require that deletes go through a dedicated stored procedure which can enforce the other constraints. So inserts and selects have to be accommodated but updates and deletes do not for this task. My Proposed Solution My proposed solution is to break first normal form and model constraints on arrays of tuples, with a trigger that breaks the rows out into another table. CREATE TABLE journal_line ( entry_id bigserial primary key, account_id int not null references account(id), journal_entry_id bigint not null, -- adding references later amount numeric not null ); I would then add "table methods" to extract debits and credits for reporting purposes: CREATE OR REPLACE FUNCTION debits(journal_line) RETURNS numeric LANGUAGE sql IMMUTABLE AS $$ SELECT CASE WHEN $1.amount < 0 THEN $1.amount * -1 ELSE NULL END; $$; CREATE OR REPLACE FUNCTION credits(journal_line) RETURNS numeric LANGUAGE sql IMMUTABLE AS $$ SELECT CASE WHEN $1.amount > 0 THEN $1.amount ELSE NULL END; $$; Then the journal entry table (simplified for this example): CREATE TABLE journal_entry ( entry_id bigserial primary key, -- no natural keys :-( journal_id int not null references journal(id), date_posted date not null, reference text not null, description text not null, journal_lines journal_line[] not null ); Then a table method and and check constraints: CREATE OR REPLACE FUNCTION running_total(journal_entry) returns numeric language sql immutable as $$ SELECT sum(amount) FROM unnest($1.journal_lines); $$; ALTER TABLE journal_entry ADD CONSTRAINT CHECK (((journal_entry.running_total) = 0)); ALTER TABLE journal_line ADD FOREIGN KEY journal_entry_id REFERENCES journal_entry(entry_id); And finally we'd have a breakout trigger: CREATE OR REPLACE FUNCTION je_breakout() RETURNS TRIGGER LANGUAGE PLPGSQL AS $$ BEGIN IF TG_OP = 'INSERT' THEN INSERT INTO journal_line (journal_entry_id, account_id, amount) SELECT NEW.id, account_id, amount FROM unnest(NEW.journal_lines); RETURN NEW; ELSE RAISE EXCEPTION 'Operation Not Allowed'; END IF; END; $$; And finally CREATE TRIGGER AFTER INSERT OR UPDATE OR DELETE ON journal_entry FOR EACH ROW EXECUTE_PROCEDURE je_breaout(); Of course the example above is simplified. There will be a status table that will track approval status allowing for separation of duties, etc. However the goal here is to prevent unbalanced transactions. Any feedback? Does this sound entirely insane? Standard Solutions? In getting to this point I have to say I have looked at four different current ERP solutions to this problems: Represent every line item as a debit and a credit against different accounts. Use of foreign keys against the line item table to enforce an eventual running total of 0 Use of constraint triggers in PostgreSQL Forcing all validation here solely through the app logic. My concerns are that #1 is pretty limiting and very hard to audit internally. It's not programmer transparent and so it strikes me as being difficult to work with in the future. The second strikes me as being very complex and required a series of contraints and foreign keys against self to make work, and therefore it strikes me as complex, hard to sort out at least in my mind, and thus hard to work with. The fourth could be done as we force all access through stored procedures anyway and this is the most common solution (have the app total things up and throw an error otherwise). However, I think proof that a constraint is followed is superior to test cases, and so the question becomes whether this in fact generates insert anomilies rather than solving them. If this is a solved problem it isn't the case that everyone agrees on the solution....

    Read the article

  • k-d tree implementation [closed]

    - by user466441
    when i run my code and debugged,i got this error - this 0x00093584 {_Myproxy=0x00000000 _Mynextiter=0x00000000 } std::_Iterator_base12 * const - _Myproxy 0x00000000 {_Mycont=??? _Myfirstiter=??? } std::_Container_proxy * _Mycont CXX0017: Error: symbol "" not found _Myfirstiter CXX0030: Error: expression cannot be evaluated + _Mynextiter 0x00000000 {_Myproxy=??? _Mynextiter=??? } std::_Iterator_base12 * but i dont know what does it means,code is this #include<iostream> #include<vector> #include<algorithm> using namespace std; struct point { float x,y; }; vector<point>pointleft(4); vector<point>pointright(4); //we are going to implement two comparison function for x and y coordinates,we need it in calculation of median (we should sort vector //by x or y according to depth informaton,is depth even or odd. bool sortby_X(point &a,point &b) { return a.x<b.x; } bool sortby_Y(point &a,point &b) { return a.y<b.y; } //so i am going to implement to median finding algorithm,one for finding median by x and another find median by y point medianx(vector<point>points) { point temp; sort(points.begin(),points.end(),sortby_X); temp=points[(points.size()/2)]; return temp; } point mediany(vector<point>points) { point temp; sort(points.begin(),points.end(),sortby_Y); temp=points[(points.size()/2)]; return temp; } //now construct basic tree structure struct Tree { float x,y; Tree(point a) { x=a.x; y=a.y; } Tree *left; Tree *right; }; Tree * build_kd( Tree *root,vector<point>points,int depth) { point temp; if(points.size()==1)// that point is as a leaf { if(root==NULL) root=new Tree(points[0]); return root; } if(depth%2==0) { temp=medianx(points); root=new Tree(temp); for(int i=0;i<points.size();i++) { if (points[i].x<temp.x) pointleft[i]=points[i]; else pointright[i]=points[i]; } } else { temp=mediany(points); root=new Tree(temp); for(int i=0;i<points.size();i++) { if(points[i].y<temp.y) pointleft[i]=points[i]; else pointright[i]=points[i]; } } return build_kd(root->left,pointleft,depth+1); return build_kd(root->right,pointright,depth+1); } void print(Tree *root) { while(root!=NULL) { cout<<root->x<<" " <<root->y; print(root->left); print(root->right); } } int main() { int depth=0; Tree *root=NULL; vector<point>points(4); float x,y; int n=4; for(int i=0;i<n;i++) { cin>>x>>y; points[i].x=x; points[i].y=y; } root=build_kd(root,points,depth); print(root); return 0; } i am trying ti implement in c++ this pseudo code tuple function build_kd_tree(int depth, set points): if points contains only one point: return that point as a leaf. if depth is even: Calculate the median x-value. Create a set of points (pointsLeft) that have x-values less than the median. Create a set of points (pointsRight) that have x-values greater than or equal to the median. else: Calculate the median y-value. Create a set of points (pointsLeft) that have y-values less than the median. Create a set of points (pointsRight) that have y-values greater than or equal to the median. treeLeft = build_kd_tree(depth + 1, pointsLeft) treeRight = build_kd_tree(depth + 1, pointsRight) return(median, treeLeft, treeRight) please help me what this error means?

    Read the article

  • spliiting code in java-don't know what's wrong [closed]

    - by ???? ?????
    I'm writing a code to split a file into many files with a size specified in the code, and then it will join these parts later. The problem is with the joining code, it doesn't work and I can't figure what is wrong! This is my code: import java.io.*; import java.util.*; public class StupidSplit { static final int Chunk_Size = 10; static int size =0; public static void main(String[] args) throws IOException { String file = "b.txt"; int chunks = DivideFile(file); System.out.print((new File(file)).delete()); System.out.print(JoinFile(file, chunks)); } static boolean JoinFile(String fname, int nChunks) { /* * Joins the chunks together. Chunks have been divided using DivideFile * function so the last part of filename will ".partxxxx" Checks if all * parts are together by matching number of chunks found against * "nChunks", then joins the file otherwise throws an error. */ boolean successful = false; File currentDirectory = new File(System.getProperty("user.dir")); // File[] fileList = currentDirectory.listFiles(); /* populate only the files having extension like "partxxxx" */ List<File> lst = new ArrayList<File>(); // Arrays.sort(fileList); for (File file : fileList) { if (file.isFile()) { String fnm = file.getName(); int lastDot = fnm.lastIndexOf('.'); // add to list which match the name given by "fname" and have //"partxxxx" as extension" if (fnm.substring(0, lastDot).equalsIgnoreCase(fname) && (fnm.substring(lastDot + 1)).substring(0, 4).equals("part")) { lst.add(file); } } } /* * sort the list - it will be sorted by extension only because we have * ensured that list only contains those files that have "fname" and * "part" */ File[] files = (File[]) lst.toArray(new File[0]); Arrays.sort(files); System.out.println("size ="+files.length); System.out.println("hello"); /* Ensure that number of chunks match the length of array */ if (files.length == nChunks-1) { File ofile = new File(fname); FileOutputStream fos; FileInputStream fis; byte[] fileBytes; int bytesRead = 0; try { fos = new FileOutputStream(ofile,true); for (File file : files) { fis = new FileInputStream(file); fileBytes = new byte[(int) file.length()]; bytesRead = fis.read(fileBytes, 0, (int) file.length()); assert(bytesRead == fileBytes.length); assert(bytesRead == (int) file.length()); fos.write(fileBytes); fos.flush(); fileBytes = null; fis.close(); fis = null; } fos.close(); fos = null; } catch (FileNotFoundException fnfe) { System.out.println("Could not find file"); successful = false; return successful; } catch (IOException ioe) { System.out.println("Cannot write to disk"); successful = false; return successful; } /* ensure size of file matches the size given by server */ successful = (ofile.length() == StupidSplit.size) ? true : false; } else { successful = false; } return successful; } static int DivideFile(String fname) { File ifile = new File(fname); FileInputStream fis; String newName; FileOutputStream chunk; //int fileSize = (int) ifile.length(); double fileSize = (double) ifile.length(); //int nChunks = 0, read = 0, readLength = Chunk_Size; int nChunks = 0, read = 0, readLength = Chunk_Size; byte[] byteChunk; try { fis = new FileInputStream(ifile); StupidSplit.size = (int)ifile.length(); while (fileSize > 0) { if (fileSize <= Chunk_Size) { readLength = (int) fileSize; } byteChunk = new byte[readLength]; read = fis.read(byteChunk, 0, readLength); fileSize -= read; assert(read==byteChunk.length); nChunks++; //newName = fname + ".part" + Integer.toString(nChunks - 1); newName = String.format("%s.part%09d", fname, nChunks - 1); chunk = new FileOutputStream(new File(newName)); chunk.write(byteChunk); chunk.flush(); chunk.close(); byteChunk = null; chunk = null; } fis.close(); System.out.println(nChunks); // fis = null; } catch (FileNotFoundException fnfe) { System.out.println("Could not find the given file"); System.exit(-1); } catch (IOException ioe) { System.out .println("Error while creating file chunks. Exiting program"); System.exit(-1); }System.out.println(nChunks); return nChunks; } } }

    Read the article

  • StackOverFlowException - but oviously NO recursion/endless loop

    - by user567706
    Hi there, I'm now blocked by this problem the entire day, read thousands of google results, but nothing seems to reflect my problem or even come near to it... i hope any of you has a push into the right direction for me. I wrote a client-server-application (so more like 2 applications) - the client collects data about his system, as well as a screenshot, serializes all this into a XML stream (the picture as a byte[]-array]) and sends this to the server in regular intervals. The server receives the stream (via tcp), deserializes the xml to an information-object and shows the information on a windows form. This process is running stable for about 20-25 minutes at a submission interval of 3 seconds. When observing the memory usage there's nothing significant to see, also kinda stable. But after these 20-25 mins the server throws a StackOverflowException at the point where it deserializes the tcp-stream, especially when setting the Image property from the byte[]-array. I thoroughly searched for recursive or endless loops, and regarding the fact that it occurs after thousands of sucessfull intervals, i could hardly imagine that. public byte[] ImageBase { get { MemoryStream ms = new MemoryStream(); _screen.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg); return ms.GetBuffer(); } set { if (_screen != null) _screen.Dispose(); //preventing well-known image memory leak MemoryStream ms = new MemoryStream(value); try { _screen = Image.FromStream(ms); //<< EXCEPTION THROWING HERE } catch (StackOverflowException ex) //thx to new CLR management this wont work anymore -.- { Console.WriteLine(ex.Message + Environment.NewLine + ex.StackTrace); } ms.Dispose(); ms = null; } } I hope that more code would be unnecessary, or it could get very complex... Please help, i have no clue at all anymore thx Chris

    Read the article

  • Abusing the word "library"

    - by William Pursell
    I see a lot of questions, both here on SO and elsewhere, about "maintaining common libraries in a VCS". That is, projects foo and bar both depend on libbaz, and the questioner is wondering how they should import the source for libbaz into the VCS for each project. My question is: WTF? If libbaz is a library, then foo doesn't need its source code at all. There are some libraries that are reasonably designed to be used in this manner (eg gnulib), but for the most part foo and bar ought to just link against the library. I guess my thinking is: if you cut-and-paste source for a library into your own source tree, then you obviously don't care about future updates to the library. If you care about updates, then just link against the library and trust the library maintainers to maintain a stable API. If you don't trust the API to remain stable, then you can't blindly update your own copy of the source anyway, so what is gained? To summarize the question: why would anyone want to maintain a copy of a library in the source code for a project rather than just linking against that library and requiring it as a dependency? If the only answer is "don't want the dependency", then why not just distribute a copy of the library along with your app, but keep them totally separate?

    Read the article

  • Merging: hg/git vs. svn

    - by stmax
    I often read that hg (and git and...) are better at merging than svn but I have never seen practical examples of where hg/git can merge something where svn fails (or where svn needs manual intervention). Could you post a few step-by-step lists of branch/modify/commit/...-operations that show where svn would fail while hg/git happily moves on? Practical, not highly exceptional cases please... Some background: we have a few dozen developers working on projects using svn, with each project (or group of similar projects) in its own repo. We know how to apply release- and feature-branches so we don't run into problems very often (i.e. we've been there, but we've learned to overcome joel's problems of "one programmer causing trauma to the whole team" or "needing six developers for two weeks to reintegrate a branch"). We have release-branches that are very stable and only used to apply bugfixes. We have trunks that should be stable enough to be able to create a release within one week. And we have feature-branches that single developers or groups of developers can work on. Yes, they are deleted after reintegration so they don't clutter up the repository. ;) So I'm still trying to find the advantages of hg/git over svn. I'd love to get some hands-on experience, but there aren't any bigger projects we could move to hg/git yet, so I'm stuck with playing with small artifical projects that only contain a few made up files. And I'm looking for a few cases where you can feel the impressive power of hg/git, since so far I have often read about them but failed to find them myself.

    Read the article

  • gevent install on x86_64 fails: "undefined symbol: evhttp_accept_socket"

    - by digitala
    I'm trying to install gevent on a fresh EC2 CentOS 5.3 64-bit system. Since the libevent version available in yum was too old for another package (beanstalkd) I compiled/installed libevent-1.4.13-stable manually using the following command: ./configure --prefix=/usr && make && make install This is the output from installing gevent: [gevent-0.12.2]# python setup.py build --libevent /usr/lib Using libevent 1.4.13-stable: libevent.so running build running build_py running build_ext Linking /usr/src/gevent-0.12.2/build/lib.linux-x86_64-2.6/gevent/core.so to /usr/src/gevent-0.12.2/gevent/core.so [gevent-0.12.2]# cd /path/to/my/project [project]# python myscript.py Traceback (most recent call last): File "myscript.py", line 9, in <module> from gevent.wsgi import WSGIServer as GeventServer File "/usr/lib/python2.6/site-packages/gevent/__init__.py", line 32, in <module> from gevent.core import reinit ImportError: /usr/lib/python2.6/site-packages/gevent/core.so: undefined symbol: evhttp_accept_socket I've followed exactly the same steps on a local VirtualBox instance (32-bit) and I'm not seeing any errors. How would I fix this?

    Read the article

  • Newbie T-SQL dynamic stored procedure -- how can I improve it?

    - by Andy Jones
    I'm new to T-SQL; all my experience is in a completely different database environment (Openedge). I've learned enough to write the procedure below -- but also enough to know that I don't know enough! This routine will have to go into a live environment soon, and it works, but I'm quite certain there are a number of c**k-ups and gotchas in it that I know nothing about. The routine copies data from table A to table B, replacing the data in table B. The tables could be in any database. I plan to call this routine multiple times from another stored procedure. Permissions aren't a problem: the routine will be run by the dba as a timed job. Could I have your suggestions as to how to make it fit best-practice? To bullet-proof it? ALTER PROCEDURE [dbo].[copyTable2Table] @sdb varchar(30), @stable varchar(30), @tdb varchar(30), @ttable varchar(30), @raiseerror bit = 1, @debug bit = 0 as begin set nocount on declare @source varchar(65) declare @target varchar(65) declare @dropstmt varchar(100) declare @insstmt varchar(100) declare @ErrMsg nvarchar(4000) declare @ErrSeverity int set @source = '[' + @sdb + '].[dbo].[' + @stable + ']' set @target = '[' + @tdb + '].[dbo].[' + @ttable + ']' set @dropStmt = 'drop table ' + @target set @insStmt = 'select * into ' + @target + ' from ' + @source set @errMsg = '' set @errSeverity = 0 if @debug = 1 print('Drop:' + @dropStmt + ' Insert:' + @insStmt) -- drop the target table, copy the source table to the target begin try begin transaction exec(@dropStmt) exec(@insStmt) commit end try begin catch if @@trancount > 0 rollback select @errMsg = error_message(), @errSeverity = error_severity() end catch -- update the log table insert into HHG_system.dbo.copyaudit (copytime, copyuser, source, target, errmsg, errseverity) values( getdate(), user_name(user_id()), @source, @target, @errMsg, @errSeverity) if @debug = 1 print ( 'Message:' + @errMsg + ' Severity:' + convert(Char, @errSeverity) ) -- handle errors, return value if @errMsg <> '' begin if @raiseError = 1 raiserror(@errMsg, @errSeverity, 1) return 1 end return 0 END Thanks!

    Read the article

  • PostgreSQL - Error: SQL state: XX000.

    - by rob
    I have a table in Postgres that looks like this: CREATE TABLE "Population" ( "Id" bigint NOT NULL DEFAULT nextval('"population_Id_seq"'::regclass), "Name" character varying(255) NOT NULL, "Description" character varying(1024), "IsVisible" boolean NOT NULL CONSTRAINT "pk_Population" PRIMARY KEY ("Id") ) WITH ( OIDS=FALSE ); And a select function that looks like this: CREATE OR REPLACE FUNCTION "Population_SelectAll"() RETURNS SETOF "Population" AS $BODY$select "Id", "Name", "Description", "IsVisible" from "Population"; $BODY$ LANGUAGE 'sql' STABLE COST 100 Calling the select function returns all the rows in the table as expected. I have a need to add a couple of columns to the table (both of which are foreign keys to other tables in the database). This gives me a new table def as follows: CREATE TABLE "Population" ( "Id" bigint NOT NULL DEFAULT nextval('"population_Id_seq"'::regclass), "Name" character varying(255) NOT NULL, "Description" character varying(1024), "IsVisible" boolean NOT NULL, "DefaultSpeciesId" bigint NOT NULL, "DefaultEcotypeId" bigint NOT NULL, CONSTRAINT "pk_Population" PRIMARY KEY ("Id"), CONSTRAINT "fk_Population_DefaultEcotypeId" FOREIGN KEY ("DefaultEcotypeId") REFERENCES "Ecotype" ("Id") MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION, CONSTRAINT "fk_Population_DefaultSpeciesId" FOREIGN KEY ("DefaultSpeciesId") REFERENCES "Species" ("Id") MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ) WITH ( OIDS=FALSE ); and function: CREATE OR REPLACE FUNCTION "Population_SelectAll"() RETURNS SETOF "Population" AS $BODY$select "Id", "Name", "Description", "IsVisible", "DefaultSpeciesId", "DefaultEcotypeId" from "Population"; $BODY$ LANGUAGE 'sql' STABLE COST 100 ROWS 1000; Calling the function after these changes results in the following error message: ERROR: could not find attribute 11 in subquery targetlist SQL state: XX000 What is causing this error and how do I fix it? I have tried to drop and recreate the columns and function - but the same error occurs. Platform is PostgreSQL 8.4 running on Windows Server. Thanks.

    Read the article

  • How to determine one to many column name from entity type

    - by snicker
    I need a way to determine the name of the column used by NHibernate to join one-to-many collections from the collected entity's type. I need to be able to determine this at runtime. Here is an example: I have some entities: namespace Entities { public class Stable { public virtual int Id {get; set;} public virtual string StableName {get; set;} public virtual IList<Pony> Ponies { get; set; } } public class Dude { public virtual int Id { get; set; } public virtual string DudesName { get; set; } public virtual IList<Pony> PoniesThatBelongToDude { get; set; } } public class Pony { public virtual int Id {get; set;} public virtual string Name {get; set;} public virtual string Color { get; set; } } } I am using NHibernate to generate the database schema, which comes out looking like this: create table "Stable" (Id integer, StableName TEXT, primary key (Id)) create table "Dude" (Id integer, DudesName TEXT, primary key (Id)) create table "Pony" (Id integer, Name TEXT, Color TEXT, Stable_id INTEGER, Dude_id INTEGER, primary key (Id)) Given that I have a Pony entity in my code, I need to be able to find out: A. Does Pony even belong to a collection in the mapping? B. If it does, what are the column names in the database table that pertain to collections In the above instance, I would like to see that Pony has two collection columns, Stable_id and Dude_id.

    Read the article

  • How do you handle the tension between refactoring and the need for merging?

    - by Xavier Nodet
    Hi, Our policy when delivering a new version is to create a branch in our VCS and handle it to our QA team. When the latter gives the green light, we tag and release our product. The branch is kept to receive (only) bug fixes so that we can create technical releases. Those bug fixes are subsequently merged on the trunk. During this time, the trunk sees the main development work, and is potentially subject to refactoring changes. The issue is that there is a tension between the need to have a stable trunk (so that the merge of bug fixes succeed -- it usually can't if the code has been e.g. extracted to another method, or moved to another class) and the need to refactor it when introducing new features. The policy in our place is to not do any refactoring before enough time has passed and the branch is stable enough. When this is the case, one can start doing refactoring changes on the trunk, and bug-fixes are to be manually committed on both the trunk and the branch. But this means that developpers must wait quite some time before committing on the trunk any refactoring change, because this could break the subsequent merge from the branch to the trunk. And having to manually port bugs from the branch to the trunk is painful. It seems to me that this hampers development... How do you handle this tension? Thanks.

    Read the article

  • How to create two columns on a web page?

    - by Roman
    I want to have two columns on my web page. For me the simples way to do that is to use a table: <table> <tr> <td> Content of the first column. </td> <td> Content of the second column. </td> </tr> </table> I like this solution because, first of all, it works (it gives exactly what I want), it is also really simple and stable (I will always have two columns, no matter how big is my window). It is easy to control the size and position of the table. However, I know that people do not like the table-layout and, as far as I know, they use div and css instead. So, I would like also to try this approach. Can anybody help me with that? I would like to have a simple solution (without tricks) that is easy to remember. It also needs to be stable (so that it will not accidentally happen that one column is under another one or they overlap or something like that).

    Read the article

  • Android - How to decide wether to run a Service in a separate Process?

    - by pableu
    I am working on an Android application that collects sensor data over the course of multiple hours. For that, we have a Service that collects the Sensor Data (e.g. Acceleration, GPS, ..), does some processing and stores them remotely on a server. Currently, this Service runs in a separate process (using android:service=":background" in the manifest). This complicates the communication between the Activities and the Service, but my predecessors created the Application this way because they thought that separating the Service from the Activities would make it more stable. I would like some more factual reasons for the effort of running a separate process. What are the advantages? Does it really run more stable? Is the Service less likely to be killed by the OS (to free up resources) if it's in a separate process? Our Application uses startForeground() and friends to minimize the chance of getting killed by the OS. The Android docs are not very specific about this, the mostly state that it depends on the Application's purpose ;-) TL;DR What are objective reasons to put a long-running Service in a separate process (in Android)?

    Read the article

  • Using VLOOKUP in Excel

    - by Mark Virtue
    VLOOKUP is one of Excel’s most useful functions, and it’s also one of the least understood.  In this article, we demystify VLOOKUP by way of a real-life example.  We’ll create a usable Invoice Template for a fictitious company. So what is VLOOKUP?  Well, of course it’s an Excel function.  This article will assume that the reader already has a passing understanding of Excel functions, and can use basic functions such as SUM, AVERAGE, and TODAY.  In its most common usage, VLOOKUP is a database function, meaning that it works with database tables – or more simply, lists of things in an Excel worksheet.  What sort of things?   Well, any sort of thing.  You may have a worksheet that contains a list of employees, or products, or customers, or CDs in your CD collection, or stars in the night sky.  It doesn’t really matter. Here’s an example of a list, or database.  In this case it’s a list of products that our fictitious company sells: Usually lists like this have some sort of unique identifier for each item in the list.  In this case, the unique identifier is in the “Item Code” column.  Note:  For the VLOOKUP function to work with a database/list, that list must have a column containing the unique identifier (or “key”, or “ID”), and that column must be the first column in the table.  Our sample database above satisfies this criterion. The hardest part of using VLOOKUP is understanding exactly what it’s for.  So let’s see if we can get that clear first: VLOOKUP retrieves information from a database/list based on a supplied instance of the unique identifier. Put another way, if you put the VLOOKUP function into a cell and pass it one of the unique identifiers from your database, it will return you one of the pieces of information associated with that unique identifier.  In the example above, you would pass VLOOKUP an item code, and it would return to you either the corresponding item’s description, its price, or its availability (its “In stock” quantity).  Which of these pieces of information will it pass you back?  Well, you get to decide this when you’re creating the formula. If all you need is one piece of information from the database, it would be a lot of trouble to go to to construct a formula with a VLOOKUP function in it.  Typically you would use this sort of functionality in a reusable spreadsheet, such as a template.  Each time someone enters a valid item code, the system would retrieve all the necessary information about the corresponding item. Let’s create an example of this:  An Invoice Template that we can reuse over and over in our fictitious company. First we start Excel… …and we create ourselves a blank invoice: This is how it’s going to work:  The person using the invoice template will fill in a series of item codes in column “A”, and the system will retrieve each item’s description and price, which will be used to calculate the line total for each item (assuming we enter a valid quantity). For the purposes of keeping this example simple, we will locate the product database on a separate sheet in the same workbook: In reality, it’s more likely that the product database would be located in a separate workbook.  It makes little difference to the VLOOKUP function, which doesn’t really care if the database is located on the same sheet, a different sheet, or a completely different workbook. In order to test the VLOOKUP formula we’re about to write, we first enter a valid item code into cell A11: Next, we move the active cell to the cell in which we want information retrieved from the database by VLOOKUP to be stored.  Interestingly, this is the step that most people get wrong.  To explain further:  We are about to create a VLOOKUP formula that will retrieve the description that corresponds to the item code in cell A11.  Where do we want this description put when we get it?  In cell B11, of course.  So that’s where we write the VLOOKUP formula – in cell B11. Select cell B11: We need to locate the list of all available functions that Excel has to offer, so that we can choose VLOOKUP and get some assistance in completing the formula.  This is found by first clicking the Formulas tab, and then clicking Insert Function:   A box appears that allows us to select any of the functions available in Excel.  To find the one we’re looking for, we could type a search term like “lookup” (because the function we’re interested in is a lookup function).  The system would return us a list of all lookup-related functions in Excel.  VLOOKUP is the second one in the list.  Select it an click OK… The Function Arguments box appears, prompting us for all the arguments (or parameters) needed in order to complete the VLOOKUP function.  You can think of this box as the function is asking us the following questions: What unique identifier are you looking up in the database? Where is the database? Which piece of information from the database, associated with the unique identifier, do you wish to have retrieved for you? The first three arguments are shown in bold, indicating that they are mandatory arguments (the VLOOKUP function is incomplete without them and will not return a valid value).  The fourth argument is not bold, meaning that it’s optional:   We will complete the arguments in order, top to bottom. The first argument we need to complete is the Lookup_value argument.  The function needs us to tell it where to find the unique identifier (the item code in this case) that it should be retuning the description of.  We must select the item code we entered earlier (in A11). Click on the selector icon to the right of the first argument: Then click once on the cell containing the item code (A11), and press Enter: The value of “A11” is inserted into the first argument. Now we need to enter a value for the Table_array argument.  In other words, we need to tell VLOOKUP where to find the database/list.  Click on the selector icon next to the second argument: Now locate the database/list and select the entire list – not including the header line.  The database is located on a separate worksheet, so we first click on that worksheet tab: Next we select the entire database, not including the header line: …and press Enter.  The range of cells that represents the database (in this case “’Product Database’!A2:D7”) is entered automatically for us into the second argument. Now we need to enter the third argument, Col_index_num.  We use this argument to specify to VLOOKUP which piece of information from the database, associate with our item code in A11, we wish to have returned to us.  In this particular example, we wish to have the item’s description returned to us.  If you look on the database worksheet, you’ll notice that the “Description” column is the second column in the database.  This means that we must enter a value of “2” into the Col_index_num box: It is important to note that that we are not entering a “2” here because the “Description” column is in the B column on that worksheet.  If the database happened to start in column K of the worksheet, we would still enter a “2” in this field. Finally, we need to decide whether to enter a value into the final VLOOKUP argument, Range_lookup.  This argument requires either a true or false value, or it should be left blank.  When using VLOOKUP with databases (as is true 90% of the time), then the way to decide what to put in this argument can be thought of as follows: If the first column of the database (the column that contains the unique identifiers) is sorted alphabetically/numerically in ascending order, then it’s possible to enter a value of true into this argument, or leave it blank. If the first column of the database is not sorted, or it’s sorted in descending order, then you must enter a value of false into this argument As the first column of our database is not sorted, we enter false into this argument: That’s it!  We’ve entered all the information required for VLOOKUP to return the value we need.  Click the OK button and notice that the description corresponding to item code “R99245” has been correctly entered into cell B11: The formula that was created for us looks like this: If we enter a different item code into cell A11, we will begin to see the power of the VLOOKUP function:  The description cell changes to match the new item code: We can perform a similar set of steps to get the item’s price returned into cell E11.  Note that the new formula must be created in cell E11.  The result will look like this: …and the formula will look like this: Note that the only difference between the two formulae is the third argument (Col_index_num) has changed from a “2” to a “3” (because we want data retrieved from the 3rd column in the database). If we decided to buy 2 of these items, we would enter a “2” into cell D11.  We would then enter a simple formula into cell F11 to get the line total: =D11*E11 …which looks like this… Completing the Invoice Template We’ve learned a lot about VLOOKUP so far.  In fact, we’ve learned all we’re going to learn in this article.  It’s important to note that VLOOKUP can be used in other circumstances besides databases.  This is less common, and may be covered in future How-To Geek articles. Our invoice template is not yet complete.  In order to complete it, we would do the following: We would remove the sample item code from cell A11 and the “2” from cell D11.  This will cause our newly created VLOOKUP formulae to display error messages: We can remedy this by judicious use of Excel’s IF() and ISBLANK() functions.  We change our formula from this…       =VLOOKUP(A11,’Product Database’!A2:D7,2,FALSE) …to this…       =IF(ISBLANK(A11),”",VLOOKUP(A11,’Product Database’!A2:D7,2,FALSE)) We would copy the formulas in cells B11, E11 and F11 down to the remainder of the item rows of the invoice.  Note that if we do this, the resulting formulas will no longer correctly refer to the database table.  We could fix this by changing the cell references for the database to absolute cell references.  Alternatively – and even better – we could create a range name for the entire product database (such as “Products”), and use this range name instead of the cell references.  The formula would change from this…       =IF(ISBLANK(A11),”",VLOOKUP(A11,’Product Database’!A2:D7,2,FALSE)) …to this…       =IF(ISBLANK(A11),”",VLOOKUP(A11,Products,2,FALSE)) …and then copy the formulas down to the rest of the invoice item rows. We would probably “lock” the cells that contain our formulae (or rather unlock the other cells), and then protect the worksheet, in order to ensure that our carefully constructed formulae are not accidentally overwritten when someone comes to fill in the invoice. We would save the file as a template, so that it could be reused by everyone in our company If we were feeling really clever, we would create a database of all our customers in another worksheet, and then use the customer ID entered in cell F5 to automatically fill in the customer’s name and address in cells B6, B7 and B8. If you would like to practice with VLOOKUP, or simply see our resulting Invoice Template, it can be downloaded from here. Similar Articles Productive Geek Tips Make Excel 2007 Print Gridlines In Workbook FileMake Excel 2007 Always Save in Excel 2003 FormatConvert Older Excel Documents to Excel 2007 FormatImport Microsoft Access Data Into ExcelChange the Default Font in Excel 2007 TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 Classic Cinema Online offers 100’s of OnDemand Movies OutSync will Sync Photos of your Friends on Facebook and Outlook Windows 7 Easter Theme YoWindoW, a real time weather screensaver Optimize your computer the Microsoft way Stormpulse provides slick, real time weather data

    Read the article

  • Beware Sneaky Reads with Unique Indexes

    - by Paul White NZ
    A few days ago, Sandra Mueller (twitter | blog) asked a question using twitter’s #sqlhelp hash tag: “Might SQL Server retrieve (out-of-row) LOB data from a table, even if the column isn’t referenced in the query?” Leaving aside trivial cases (like selecting a computed column that does reference the LOB data), one might be tempted to say that no, SQL Server does not read data you haven’t asked for.  In general, that’s quite correct; however there are cases where SQL Server might sneakily retrieve a LOB column… Example Table Here’s a T-SQL script to create that table and populate it with 1,000 rows: CREATE TABLE dbo.LOBtest ( pk INTEGER IDENTITY NOT NULL, some_value INTEGER NULL, lob_data VARCHAR(MAX) NULL, another_column CHAR(5) NULL, CONSTRAINT [PK dbo.LOBtest pk] PRIMARY KEY CLUSTERED (pk ASC) ); GO DECLARE @Data VARCHAR(MAX); SET @Data = REPLICATE(CONVERT(VARCHAR(MAX), 'x'), 65540);   WITH Numbers (n) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2 ) INSERT LOBtest WITH (TABLOCKX) ( some_value, lob_data ) SELECT TOP (1000) N.n, @Data FROM Numbers N WHERE N.n <= 1000; Test 1: A Simple Update Let’s run a query to subtract one from every value in the some_value column: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; As you might expect, modifying this integer column in 1,000 rows doesn’t take very long, or use many resources.  The STATITICS IO and TIME output shows a total of 9 logical reads, and 25ms elapsed time.  The query plan is also very simple: Looking at the Clustered Index Scan, we can see that SQL Server only retrieves the pk and some_value columns during the scan: The pk column is needed by the Clustered Index Update operator to uniquely identify the row that is being changed.  The some_value column is used by the Compute Scalar to calculate the new value.  (In case you are wondering what the Top operator is for, it is used to enforce SET ROWCOUNT). Test 2: Simple Update with an Index Now let’s create a nonclustered index keyed on the some_value column, with lob_data as an included column: CREATE NONCLUSTERED INDEX [IX dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest (some_value) INCLUDE ( lob_data ) WITH ( FILLFACTOR = 100, MAXDOP = 1, SORT_IN_TEMPDB = ON ); This is not a useful index for our simple update query; imagine that someone else created it for a different purpose.  Let’s run our update query again: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; We find that it now requires 4,014 logical reads and the elapsed query time has increased to around 100ms.  The extra logical reads (4 per row) are an expected consequence of maintaining the nonclustered index. The query plan is very similar to before (click to enlarge): The Clustered Index Update operator picks up the extra work of maintaining the nonclustered index. The new Compute Scalar operators detect whether the value in the some_value column has actually been changed by the update.  SQL Server may be able to skip maintaining the nonclustered index if the value hasn’t changed (see my previous post on non-updating updates for details).  Our simple query does change the value of some_data in every row, so this optimization doesn’t add any value in this specific case. The output list of columns from the Clustered Index Scan hasn’t changed from the one shown previously: SQL Server still just reads the pk and some_data columns.  Cool. Overall then, adding the nonclustered index hasn’t had any startling effects, and the LOB column data still isn’t being read from the table.  Let’s see what happens if we make the nonclustered index unique. Test 3: Simple Update with a Unique Index Here’s the script to create a new unique index, and drop the old one: CREATE UNIQUE NONCLUSTERED INDEX [UQ dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest (some_value) INCLUDE ( lob_data ) WITH ( FILLFACTOR = 100, MAXDOP = 1, SORT_IN_TEMPDB = ON ); GO DROP INDEX [IX dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest; Remember that SQL Server only enforces uniqueness on index keys (the some_data column).  The lob_data column is simply stored at the leaf-level of the non-clustered index.  With that in mind, we might expect this change to make very little difference.  Let’s see: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; Whoa!  Now look at the elapsed time and logical reads: Scan count 1, logical reads 2016, physical reads 0, read-ahead reads 0, lob logical reads 36015, lob physical reads 0, lob read-ahead reads 15992.   CPU time = 172 ms, elapsed time = 16172 ms. Even with all the data and index pages in memory, the query took over 16 seconds to update just 1,000 rows, performing over 52,000 LOB logical reads (nearly 16,000 of those using read-ahead). Why on earth is SQL Server reading LOB data in a query that only updates a single integer column? The Query Plan The query plan for test 3 looks a bit more complex than before: In fact, the bottom level is exactly the same as we saw with the non-unique index.  The top level has heaps of new stuff though, which I’ll come to in a moment. You might be expecting to find that the Clustered Index Scan is now reading the lob_data column (for some reason).  After all, we need to explain where all the LOB logical reads are coming from.  Sadly, when we look at the properties of the Clustered Index Scan, we see exactly the same as before: SQL Server is still only reading the pk and some_value columns – so what’s doing the LOB reads? Updates that Sneakily Read Data We have to go as far as the Clustered Index Update operator before we see LOB data in the output list: [Expr1020] is a bit flag added by an earlier Compute Scalar.  It is set true if the some_value column has not been changed (part of the non-updating updates optimization I mentioned earlier). The Clustered Index Update operator adds two new columns: the lob_data column, and some_value_OLD.  The some_value_OLD column, as the name suggests, is the pre-update value of the some_value column.  At this point, the clustered index has already been updated with the new value, but we haven’t touched the nonclustered index yet. An interesting observation here is that the Clustered Index Update operator can read a column into the data flow as part of its update operation.  SQL Server could have read the LOB data as part of the initial Clustered Index Scan, but that would mean carrying the data through all the operations that occur prior to the Clustered Index Update.  The server knows it will have to go back to the clustered index row to update it, so it delays reading the LOB data until then.  Sneaky! Why the LOB Data Is Needed This is all very interesting (I hope), but why is SQL Server reading the LOB data?  For that matter, why does it need to pass the pre-update value of the some_value column out of the Clustered Index Update? The answer relates to the top row of the query plan for test 3.  I’ll reproduce it here for convenience: Notice that this is a wide (per-index) update plan.  SQL Server used a narrow (per-row) update plan in test 2, where the Clustered Index Update took care of maintaining the nonclustered index too.  I’ll talk more about this difference shortly. The Split/Sort/Collapse combination is an optimization, which aims to make per-index update plans more efficient.  It does this by breaking each update into a delete/insert pair, reordering the operations, removing any redundant operations, and finally applying the net effect of all the changes to the nonclustered index. Imagine we had a unique index which currently holds three rows with the values 1, 2, and 3.  If we run a query that adds 1 to each row value, we would end up with values 2, 3, and 4.  The net effect of all the changes is the same as if we simply deleted the value 1, and added a new value 4. By applying net changes, SQL Server can also avoid false unique-key violations.  If we tried to immediately update the value 1 to a 2, it would conflict with the existing value 2 (which would soon be updated to 3 of course) and the query would fail.  You might argue that SQL Server could avoid the uniqueness violation by starting with the highest value (3) and working down.  That’s fine, but it’s not possible to generalize this logic to work with every possible update query. SQL Server has to use a wide update plan if it sees any risk of false uniqueness violations.  It’s worth noting that the logic SQL Server uses to detect whether these violations are possible has definite limits.  As a result, you will often receive a wide update plan, even when you can see that no violations are possible. Another benefit of this optimization is that it includes a sort on the index key as part of its work.  Processing the index changes in index key order promotes sequential I/O against the nonclustered index. A side-effect of all this is that the net changes might include one or more inserts.  In order to insert a new row in the index, SQL Server obviously needs all the columns – the key column and the included LOB column.  This is the reason SQL Server reads the LOB data as part of the Clustered Index Update. In addition, the some_value_OLD column is required by the Split operator (it turns updates into delete/insert pairs).  In order to generate the correct index key delete operation, it needs the old key value. The irony is that in this case the Split/Sort/Collapse optimization is anything but.  Reading all that LOB data is extremely expensive, so it is sad that the current version of SQL Server has no way to avoid it. Finally, for completeness, I should mention that the Filter operator is there to filter out the non-updating updates. Beating the Set-Based Update with a Cursor One situation where SQL Server can see that false unique-key violations aren’t possible is where it can guarantee that only one row is being updated.  Armed with this knowledge, we can write a cursor (or the WHILE-loop equivalent) that updates one row at a time, and so avoids reading the LOB data: SET NOCOUNT ON; SET STATISTICS XML, IO, TIME OFF;   DECLARE @PK INTEGER, @StartTime DATETIME; SET @StartTime = GETUTCDATE();   DECLARE curUpdate CURSOR LOCAL FORWARD_ONLY KEYSET SCROLL_LOCKS FOR SELECT L.pk FROM LOBtest L ORDER BY L.pk ASC;   OPEN curUpdate;   WHILE (1 = 1) BEGIN FETCH NEXT FROM curUpdate INTO @PK;   IF @@FETCH_STATUS = -1 BREAK; IF @@FETCH_STATUS = -2 CONTINUE;   UPDATE dbo.LOBtest SET some_value = some_value - 1 WHERE CURRENT OF curUpdate; END;   CLOSE curUpdate; DEALLOCATE curUpdate;   SELECT DATEDIFF(MILLISECOND, @StartTime, GETUTCDATE()); That completes the update in 1280 milliseconds (remember test 3 took over 16 seconds!) I used the WHERE CURRENT OF syntax there and a KEYSET cursor, just for the fun of it.  One could just as well use a WHERE clause that specified the primary key value instead. Clustered Indexes A clustered index is the ultimate index with included columns: all non-key columns are included columns in a clustered index.  Let’s re-create the test table and data with an updatable primary key, and without any non-clustered indexes: IF OBJECT_ID(N'dbo.LOBtest', N'U') IS NOT NULL DROP TABLE dbo.LOBtest; GO CREATE TABLE dbo.LOBtest ( pk INTEGER NOT NULL, some_value INTEGER NULL, lob_data VARCHAR(MAX) NULL, another_column CHAR(5) NULL, CONSTRAINT [PK dbo.LOBtest pk] PRIMARY KEY CLUSTERED (pk ASC) ); GO DECLARE @Data VARCHAR(MAX); SET @Data = REPLICATE(CONVERT(VARCHAR(MAX), 'x'), 65540);   WITH Numbers (n) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2 ) INSERT LOBtest WITH (TABLOCKX) ( pk, some_value, lob_data ) SELECT TOP (1000) N.n, N.n, @Data FROM Numbers N WHERE N.n <= 1000; Now here’s a query to modify the cluster keys: UPDATE dbo.LOBtest SET pk = pk + 1; The query plan is: As you can see, the Split/Sort/Collapse optimization is present, and we also gain an Eager Table Spool, for Halloween protection.  In addition, SQL Server now has no choice but to read the LOB data in the Clustered Index Scan: The performance is not great, as you might expect (even though there is no non-clustered index to maintain): Table 'LOBtest'. Scan count 1, logical reads 2011, physical reads 0, read-ahead reads 0, lob logical reads 36015, lob physical reads 0, lob read-ahead reads 15992.   Table 'Worktable'. Scan count 1, logical reads 2040, physical reads 0, read-ahead reads 0, lob logical reads 34000, lob physical reads 0, lob read-ahead reads 8000.   SQL Server Execution Times: CPU time = 483 ms, elapsed time = 17884 ms. Notice how the LOB data is read twice: once from the Clustered Index Scan, and again from the work table in tempdb used by the Eager Spool. If you try the same test with a non-unique clustered index (rather than a primary key), you’ll get a much more efficient plan that just passes the cluster key (including uniqueifier) around (no LOB data or other non-key columns): A unique non-clustered index (on a heap) works well too: Both those queries complete in a few tens of milliseconds, with no LOB reads, and just a few thousand logical reads.  (In fact the heap is rather more efficient). There are lots more fun combinations to try that I don’t have space for here. Final Thoughts The behaviour shown in this post is not limited to LOB data by any means.  If the conditions are met, any unique index that has included columns can produce similar behaviour – something to bear in mind when adding large INCLUDE columns to achieve covering queries, perhaps. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

    Read the article

  • CLSF & CLK 2013 Trip Report by Jeff Liu

    - by jamesmorris
    This is a contributed post from Jeff Liu, lead XFS developer for the Oracle mainline Linux kernel team. Recently, I attended both the China Linux Storage and Filesystem workshop (CLSF), and the China Linux Kernel conference (CLK), which were held in Shanghai. Here are the highlights for both events. CLSF - 17th October XFS update (led by Jeff Liu) XFS keeps rapid progress with a lot of changes, especially focused on the infrastructure/performance improvements as well as  new feature development.  This can be reflected with a sample statistics among XFS/Ext4+JBD2/Btrfs via: # git diff --stat --minimal -C -M v3.7..v3.12-rc4 -- fs/xfs|fs/ext4+fs/jbd2|fs/btrfs XFS: 141 files changed, 27598 insertions(+), 19113 deletions(-) Ext4+JBD2: 39 files changed, 10487 insertions(+), 5454 deletions(-) Btrfs: 70 files changed, 19875 insertions(+), 8130 deletions(-) What made up those changes in XFS? Self-describing metadata(CRC32c). This is a new feature and it contributed about 70% code changes, it can be enabled via `mkfs.xfs -m crc=1 /dev/xxx` for v5 superblock. Transaction log space reservation improvements. With this change, we can calculate the log space reservation at mount time rather than runtime to reduce the the CPU overhead. User namespace support. So both XFS and USERNS can be enabled on kernel configuration begin from Linux 3.10. Thanks Dwight Engen's efforts for this thing. Split project/group quota inodes. Originally, project quota can not be enabled with group quota at the same time because they were share the same quota file inode, now it works but only for v5 super block. i.e, CRC enabled. CONFIG_XFS_WARN, an new lightweight runtime debugger which can be deployed in production environment. Readahead log object recovery, this change can speed up the log replay progress significantly. Speculative preallocation inode tracking, clearing and throttling. The main purpose is to deal with inodes with post-EOF space due to speculative preallocation, support improved quota management to free up a significant amount of unwritten space when at or near EDQUOT. It support backgroup scanning which occurs on a longish interval(5 mins by default, tunable), and on-demand scanning/trimming via ioctl(2). Bitter arguments ensued from this session, especially for the comparison between Ext4 and Btrfs in different areas, I have to spent a whole morning of the 1st day answering those questions. We basically agreed on XFS is the best choice in Linux nowadays because: Stable, XFS has a good record in stability in the past 10 years. Fengguang Wu who lead the 0-day kernel test project also said that he has observed less error than other filesystems in the past 1+ years, I own it to the XFS upstream code reviewer, they always performing serious code review as well as testing. Good performance for large/small files, XFS does not works very well for small files has already been an old story for years. Best choice (maybe) for distributed PB filesystems. e.g, Ceph recommends delopy OSD daemon on XFS because Ext4 has limited xattr size. Best choice for large storage (>16TB). Ext4 does not support a single file more than around 15.95TB. Scalability, any objection to XFS is best in this point? :) XFS is better to deal with transaction concurrency than Ext4, why? The maximum size of the log in XFS is 2038MB compare to 128MB in Ext4. Misc. Ext4 is widely used and it has been proved fast/stable in various loads and scenarios, XFS just need more customers, and Btrfs is still on the road to be a manhood. Ceph Introduction (Led by Li Wang) This a hot topic.  Li gave us a nice introduction about the design as well as their current works. Actually, Ceph client has been included in Linux kernel since 2.6.34 and supported by Openstack since Folsom but it seems that it has not yet been widely deployment in production environment. Their major work is focus on the inline data support to separate the metadata and data storage, reduce the file access time, i.e, a file access need communication twice, fetch the metadata from MDS and then get data from OSD, and also, the small file access is limited by the network latency. The solution is, for the small files they would like to store the data at metadata so that when accessing a small file, the metadata server can push both metadata and data to the client at the same time. In this way, they can reduce the overhead of calculating the data offset and save the communication to OSD. For this feature, they have only run some small scale testing but really saw noticeable improvements. Test environment: Intel 2 CPU 12 Core, 64GB RAM, Ubuntu 12.04, Ceph 0.56.6 with 200GB SATA disk, 15 OSD, 1 MDS, 1 MON. The sequence read performance for 1K size files improved about 50%. I have asked Li and Zheng Yan (the core developer of Ceph, who also worked on Btrfs) whether Ceph is really stable and can be deployed at production environment for large scale PB level storage, but they can not give a positive answer, looks Ceph even does not spread over Dreamhost (subject to confirmation). From Li, they only deployed Ceph for a small scale storage(32 nodes) although they'd like to try 6000 nodes in the future. Improve Linux swap for Flash storage (led by Shaohua Li) Because of high density, low power and low price, flash storage (SSD) is a good candidate to partially replace DRAM. A quick answer for this is using SSD as swap. But Linux swap is designed for slow hard disk storage, so there are a lot of challenges to efficiently use SSD for swap. SWAPOUT swap_map scan swap_map is the in-memory data structure to track swap disk usage, but it is a slow linear scan. It will become a bottleneck while finding many adjacent pages in the use of SSD. Shaohua Li have changed it to a cluster(128K) list, resulting in O(1) algorithm. However, this apporoach needs restrictive cluster alignment and only enabled for SSD. IO pattern In most cases, the swap io is in interleaved pattern because of mutiple reclaimers or a free cluster is shared by all reclaimers. Even though block layer can merge interleaved IO to some extent, but we cannot count on it completely. Hence the per-cpu cluster is added base on the previous change, it can help reclaimer do sequential IO and the block layer will be easier to merge IO. TLB flush: If we're reclaiming one active page, we should first move the page from active lru list to inactive lru list, and then reclaim the page from inactive lru to swap it out. During the process, we need to clear PTE twice: first is 'A'(ACCESS) bit, second is 'P'(PRESENT) bit. Processors need to send lots of ipi which make the TLB flush really expensive. Some works have been done to improve this, including rework smp_call_functiom_many() or remove the first TLB flush in x86, but there still have some arguments here and only parts of works have been pushed to mainline. SWAPIN: Page fault does iodepth=1 sync io, but it's a little waste if only issue a page size's IO. The obvious solution is doing swap readahead. But the current in-kernel swap readahead is arbitary(always 8 pages), and it always doesn't perform well for both random and sequential access workload. Shaohua introduced a new flag for madvise(MADV_WILLNEED) to do swap prefetch, so the changes happen in userspace API and leave the in-kernel readahead unchanged(but I think some improvement can also be done here). SWAP discard As we know, discard is important for SSD write throughout, but the current swap discard implementation is synchronous. He changed it to async discard which allow discard and write run in the same time. Meanwhile, the unit of discard is also optimized to cluster. Misc: lock contention For many concurrent swapout and swapin , the lock contention such as anon_vma or swap_lock is high, so he changed the swap_lock to a per-swap lock. But there still have some lock contention in very high speed SSD because of swapcache address_space lock. Zproject (led by Bob Liu) Bob gave us a very nice introduction about the current memory compression status. Now there are 3 projects(zswap/zram/zcache) which all aim at smooth swap IO storm and promote performance, but they all have their own pros and cons. ZSWAP It is implemented based on frontswap API and it uses a dynamic allocater named Zbud to allocate free pages. Zbud means pairs of zpages are "buddied" and it can only store at most two compressed pages in one page frame, so the max compress ratio is 50%. Each page frame is lru-linked and can do shink in memory pressure. If the compressed memory pool reach its limitation, shink or reclaim happens. It decompress the page frame into two new allocated pages and then write them to real swap device, but it can fail when allocating the two pages. ZRAM Acts as a compressed ramdisk and used as swap device, and it use zsmalloc as its allocator which has high density but may have fragmentation issues. Besides, page reclaim is hard since it will need more pages to uncompress and free just one page. ZRAM is preferred by embedded system which may not have any real swap device. Now both ZRAM and ZSWAP are in driver/staging tree, and in the mm community there are some disscussions of merging ZRAM into ZSWAP or viceversa, but no agreement yet. ZCACHE Handles file page compression but it is removed out of staging recently. From industry (led by Tang Jie, LSI) An LSI engineer introduced several new produces to us. The first is raid5/6 cards that it use full stripe writes to improve performance. The 2nd one he introduced is SandForce flash controller, who can understand data file types (data entropy) to reduce write amplification (WA) for nearly all writes. It's called DuraWrite and typical WA is 0.5. What's more, if enable its Dynamic Logical Capacity function module, the controller can do data compression which is transparent to upper layer. LSI testing shows that with this virtual capacity enables 1x TB drive can support up to 2x TB capacity, but the application must monitor free flash space to maintain optimal performance and to guard against free flash space exhaustion. He said the most useful application is for datebase. Another thing I think it's worth to mention is that a NV-DRAM memory in NMR/Raptor which is directly exposed to host system. Applications can directly access the NV-DRAM via a memory address - using standard system call mmap(). He said that it is very useful for database logging now. This kind of NVM produces are beginning to appear in recent years, and it is said that Samsung is building a research center in China for related produces. IMHO, NVM will bring an effect to current os layer especially on file system, e.g. its journaling may need to redesign to fully utilize these nonvolatile memory. OCFS2 (led by Canquan Shen) Without a doubt, HuaWei is the biggest contributor to OCFS2 in the past two years. They have posted 46 upstream patches and 39 patches have been merged. Their current project is based on 32/64 nodes cluster, but they also tried 128 nodes at the experimental stage. The major work they are working is to support ATS (atomic test and set), it can be works with DLM at the same time. Looks this idea is inspired by the vmware VMFS locking, i.e, http://blogs.vmware.com/vsphere/2012/05/vmfs-locking-uncovered.html CLK - 18th October 2013 Improving Linux Development with Better Tools (Andi Kleen) This talk focused on how to find/solve bugs along with the Linux complexity growing. Generally, we can do this with the following kind of tools: Static code checkers tools. e.g, sparse, smatch, coccinelle, clang checker, checkpatch, gcc -W/LTO, stanse. This can help check a lot of things, simple mistakes, complex problems, but the challenges are: some are very slow, false positives, may need a concentrated effort to get false positives down. Especially, no static checker I found can follow indirect calls (“OO in C”, common in kernel): struct foo_ops { int (*do_foo)(struct foo *obj); } foo->do_foo(foo); Dynamic runtime checkers, e.g, thread checkers, kmemcheck, lockdep. Ideally all kernel code would come with a test suite, then someone could run all the dynamic checkers. Fuzzers/test suites. e.g, Trinity is a great tool, it finds many bugs, but needs manual model for each syscall. Modern fuzzers around using automatic feedback, but notfor kernel yet: http://taviso.decsystem.org/making_software_dumber.pdf Debuggers/Tracers to understand code, e.g, ftrace, can dump on events/oops/custom triggers, but still too much overhead in many cases to run always during debug. Tools to read/understand source, e.g, grep/cscope work great for many cases, but do not understand indirect pointers (OO in C model used in kernel), give us all “do_foo” instances: struct foo_ops { int (*do_foo)(struct foo *obj); } = { .do_foo = my_foo }; foo>do_foo(foo); That would be great to have a cscope like tool that understands this based on types/initializers XFS: The High Performance Enterprise File System (Jeff Liu) [slides] I gave a talk for introducing the disk layout, unique features, as well as the recent changes.   The slides include some charts to reflect the performances between XFS/Btrfs/Ext4 for small files. About a dozen users raised their hands when I asking who has experienced with XFS. I remembered that when I asked the same question in LinuxCon/Japan, only 3 people raised their hands, but they are Chris Mason, Ric Wheeler, and another attendee. The attendee questions were mainly focused on stability, and comparison with other file systems. Linux Containers (Feng Gao) The speaker introduced us that the purpose for those kind of namespaces, include mount/UTS/IPC/Network/Pid/User, as well as the system API/ABI. For the userspace tools, He mainly focus on the Libvirt LXC rather than us(LXC). Libvirt LXC is another userspace container management tool, implemented as one type of libvirt driver, it can manage containers, create namespace, create private filesystem layout for container, Create devices for container and setup resources controller via cgroup. In this talk, Feng also mentioned another two possible new namespaces in the future, the 1st is the audit, but not sure if it should be assigned to user namespace or not. Another is about syslog, but the question is do we really need it? In-memory Compression (Bob Liu) Same as CLSF, a nice introduction that I have already mentioned above. Misc There were some other talks related to ACPI based memory hotplug, smart wake-affinity in scheduler etc., but my head is not big enough to record all those things. -- Jeff Liu

    Read the article

  • CodePlex Daily Summary for Tuesday, May 25, 2010

    CodePlex Daily Summary for Tuesday, May 25, 2010New ProjectsBibleNames: BibleNames BibleNames BibleNames BibleNames BibleNamesBing Search for PHP Developers: This is the Bing SDK for PHP, a toolkit that allows you to easily use Bing's API to fetch search results and use in your own application. The Bin...Fading Clock: Fading Clock is a normal clock that has the ability to fade in out. It's developed in C# as a Windows App in .net 2.0.Fuzzy string matching algorithm in C# and LINQ: Fuzzy matching strings. Find out how similar two string is, and find the best fuzzy matching string from a string table. Given a string (strA) and...HexTile editor: Testing hexagonal tile editorhgcheck12: hgcheck12Metaverse Router: Metaverse Router makes it easier for MIIS, ILM, FIM Sync engine administrators to manage multiple provisioning modules, turn on/off provisioning wi...MyVocabulary: Use MyVocabulary to structure and test the words you want to learn in a foreign language. This is a .net 3.5 windows forms application developed in...phpxw: Phpxw 是一个简易的PHP框架。以我自己的姓名命名的。 Phpxw is a simple PHP framework. Take my name named.Plop: Social networking wrappers and projects.PST Data Structure View Tool: PST Data Structure View Tool (PSTViewTool) is a tool supporting the PST file format documentation effort. It allows the user to browse the internal...PST File Format SDK: PST File Format SDK (pstsdk) is a cross platform header only C++ library for reading PST files.QWine: QWine is Queue Machine ApplicationSharePoint Cross Site Collection Security Trimmed Navigation: This SP2010 project will show security trimmed navigation that works across site collections. The project is written for SP2010, but can be easily ...SharePoint List Field Manager: The SharePoint List Field Manager allows users to manage the Boolean properties of a list such as Read Only, Hidden, Show in New Form etc... It sup...Silverlight Toolbar for DataGrid working with RIA Services or local data: DataGridToolbar contains two controls for Silverlight DataGrid designed for RIA Services and local data. It can be used to filter or remove a data,...SilverShader - Silverlight Pixel Shader Demos: SilverShader is an extensible Silverlight application that is used to demonstrate the effect of different pixel shaders. The shaders can be applied...SNCFT Gadget: Ce gadget permet de consulter les horaires des trains et de chercher des informations sur le site de la société nationale des chemins de fer tunisi...Software Transaction Memory: Software Transaction Memory for .NETStreamInsight Samples: This project contains sample code for StreamInsight, Microsoft's platform for complex event processing. The purpose of the samples is to provide a ...StyleAnalizer: A CSS parserSudoku (Multiplayer in RnD): Sudoku project was to practice on C# by making a desktop application using some algorithm Before this, I had worked on http://shaktisaran.tech.o...Tiplican: A small website built for the purpose of learning .Net 4 and MVC 2TPager: Mercurial pager with color support on Windowsunirca: UNIRCA projectWcfTrace: The WcfTrace is a easy way to collect information about WCF-service call order, processing time and etc. It's developed in C#.New ReleasesASP.NET TimePicker Control: 12 24 Hour Bug Fix: 12 24 Hour Bug FixASP.NET TimePicker Control: ASP.NET TimePicker Control: This release fixes a focus bug that manifests itself when switching focus between two different timepicker controls on the same page, and make chan...ASP.NET TimePicker Control: ASP.NET TimePicker Control - colon CSS Fix: Fixes ":" seperator placement issues being too hi and too low in IE and FireFox, respectively.ASP.NET TimePicker Control: Release fixes 24 Hour Mode Bug: Release fixes 24 Hour Mode BugBFBC2 PRoCon: PRoCon 0.5.1.1: Visit http://phogue.net/?p=604 for release notes.BFBC2 PRoCon: PRoCon 0.5.1.2: Release notes can be found at http://phogue.net/?p=604BFBC2 PRoCon: PRoCon 0.5.1.4: Ha.. choosing the "stable" option at the moment is a bit of a joke =\ Release notes at http://phogue.net/?p=604BFBC2 PRoCon: PRoCon 0.5.1.5: BWHAHAHA stable.. ha. Actually this ones looking pretty good now. http://phogue.net/?p=604Bojinx: Bojinx Core V4.5.14: Issues fixed in this release: Fixed an issue that caused referencePropertyName when used through a property configuration in the context to not wo...Bojinx: Bojinx Debugger V0.9B: Output trace and filtering that works with the Bojinx logger.CassiniDev - Cassini 3.5/4.0 Developers Edition: CassiniDev 3.5.1.5 and 4.0.1.5 beta3: Fixed fairly serious bug #13290 http://cassinidev.codeplex.com/WorkItem/View.aspx?WorkItemId=13290Content Rendering: Content Rendering API 1.0.0 Revision 46406: Initial releaseDeploy Workflow Manager: Deploy Workflow Manager Web Part v2: Recommend you test in your development environment first BEFORE using in production.dotSpatial: System.Spatial.Projections Zip May 24, 2010: Adds a new spherical projection.eComic: eComic 2010.0.0.2: Quick release to fix a couple of bugs found in the previous version. Version 2010.0.0.2 Fixed crash error when accessing the "Go To Page" dialog ...Exchange 2010 RBAC Editor (RBAC GUI) - updated on 5/24/2010: RBAC Editor 0.9.4.1: Some bugs fixed; support for unscopoedtoplevel (adding script is not working yet) Please use email address in About menu of the tool for any feedb...Facebook Graph Toolkit: Preview 1 Binaries: The first preview release of the toolkit. Not recommended for use in production, but enought to get started developing your app.Fading Clock: Clock.zip: Clock.zip is a zip file that contains the application Clock.exe.hgcheck12: Rel8082: Rel8082hgcheck12: scsc: scasMetaverse Router: Metaverse Router v1.0: Initial stable release (v.1.0.0.0) of Metaverse Router Working with: FIM 2010 Synchronization Engine ILM 2007 MIIS 2003MSTestGlider: MSTestGlider 1.5: What MSTestGlider 1.5.zip unzips to: http://i49.tinypic.com/2lcv4eg.png If you compile MSTestGlider from its source you will need to change the ou...MyVocabulary: Version 1.0: First releaseNLog - Advanced .NET Logging: Nightly Build 2010.05.24.001: Changes since the last build:2010-05-23 20:45:37 Jarek Kowalski fixed whitespace in NLog.wxs 2010-05-23 12:01:48 Jarek Kowalski BufferingTargetWra...NoteExpress User Tools (NEUT) - Do it by ourselves!: NoteExpress User Tools 2.0.0: 测试版本:NoteExpress 2.5.0.1154 +调整了Tab页的排列方式 注:2.0未做大的改动,仅仅是运行环境由原来的.net 3.5升级到4.0。openrs: Beta Release (Revision 1): This is the beta release of the openrs framework. Please make sure you submit issues in the issue tracker tab. As this is beta, extreme, flawless ...openrs: Revision 2: Revision 2 of the framework. Basic worker example as well as minor improvements on the auth packet.phpxw: Phpxw: Phpxw 1.0 phpxw 是一个简易的PHP框架。以我自己的姓名命名的。 Phpxw is a simple PHP framework. Take my name named. 支持基本的业务逻辑流程,功能模块化,实现了简易的模板技术,同时也可以支持外接模板引擎。 Support...sELedit: sELedit v1.1a: Fixed: clean file before overwriting Fixed: list57 will only created when eLC.Lists.length > 57sGSHOPedit: sGSHOPedit v1.0a: Fixed: bug with wrong item array re-size when adding items after deleting items Added: link to project page pwdatabase.com version is now selec...SharePoint Cross Site Collection Security Trimmed Navigation: Release 1.0.0.0: If you want just the .wsp, and start using this, you can grab it here. Just stsadm add/deploy to your website, and activate the feature as describ...Silverlight 4.0 Popup Menu: Context Menu for Silverlight 4.0 v1.24 Beta: - Updated the demo and added clipboard cut/copy and paste functionality. - Added delay on hover events for both parent and child menus. - Parent me...Silverlight Toolbar for DataGrid working with RIA Services or local data: DataGridToolBar Beta: For Silverlight 4.0sMAPtool: sMAPtool v0.7d (without Maps): Added: link to project pagesMODfix: sMODfix v1.0a: Added: Support for ECM v52 modelssNPCedit: sNPCedit v0.9a: browse source commits for all changes...SocialScapes: SocialScapes TwitterWidget 1.0.0: The SocialScapes TwitterWidget is a DotNetNuke Widget for displaying Twitter searches. This widget will be used to replace the twitter functionali...SQL Server 2005 and 2008 - Backup, Integrity Check and Index Optimization: 23 May 2010: This is the latest version of my solution for Backup, Integrity Check and Index Optimization in SQL Server 2005, SQL Server 2008 and SQL Server 200...sqwarea: Sqwarea 0.0.280.0 (alpha): This release brings a lot of improvements. We strongly recommend you to upgrade to this version.sTASKedit: sTASKedit v0.7c: Minor Changes in GUI & BehaviourSudoku (Multiplayer in RnD): Sudoku (Multiplayer in RnD) 1.0.0.0 source: Sudoku project was to practice on C# by making a desktop application using some algorithm Idea: The basic idea of algorithm is from http://www.ac...Sudoku (Multiplayer in RnD): Sudoku (Multiplayer in RnD) 1.0.0.1 source: Worked on user-interface, would improve it Sudoku project was to practice on C# by making a desktop application using some algorithm Idea: The b...TFS WorkItem Watcher: TFS WorkItem Watcher Version 1.0: This version contains the following new features: Added support to autodetect whether to start as a service or to start in console mode. The "-c" ...TfsPolicyPack: TfsPolicyPack 0.1: This is the first release of the TfsPolicyPack. This release includes the following policies: CustomRegexPathPolicythinktecture Starter STS (Community Edition): StarterSTS v1.1 CTP: Added ActAs / identity delegation support.TPager: TPager-20100524: TPager 2010-05-24 releaseTrance Layer: TranceLayer Transformer: Transformer is a Beta version 2, morphing from "Digger" to "Transformer" release cycle. It is intended to be used as a demonstration of muscles wh...TweetSharp: TweetSharp v1.0.0.0: Changes in v1.0.0Added 100% public code comments Bug fixes based on feedback from the Release Candidate Changes to handle Twitter schema additi...VCC: Latest build, v2.1.30524.0: Automatic drop of latest buildWCF Client Generator: Version 0.9.2.33468: Version 0.9.2.33468 Fixed: Nested enum types names are not handled correctly. Can't close Visual Studio if generated files are open when the code...Word 2007 Redaction Tool: Version 1.2: A minor update to the Word 2007 Redaction Tool. This version can be installed directly over any existing version. Updates to Version 1.2Fixed bugs:...xPollinate - Windows Live Writer Cross Post Plugin: 1.0.0.5 for WLW 14.0.8117.416: This version works with WLW 14.0.8117.416. This release includes a fix to enable publishing posts that have been opened directly from a blog, but ...Yet another developer blog - Examples: jQuery Autocomplete in ASP.NET MVC: This sample application shows how to use jQuery Autocomplete plugin in ASP.NET MVC. This application is accompanied by the following entry on Yet a...Most Popular ProjectsRawrWBFS ManagerAJAX Control ToolkitMicrosoft SQL Server Product Samples: DatabaseSilverlight ToolkitWindows Presentation Foundation (WPF)patterns & practices – Enterprise LibraryMicrosoft SQL Server Community & SamplesPHPExcelASP.NETMost Active Projectspatterns & practices – Enterprise LibraryRawrpatterns & practices: Windows Azure Security GuidanceSqlServerExtensionsGMap.NET - Great Maps for Windows Forms & PresentationMono.AddinsCaliburn: An Application Framework for WPF and SilverlightBlogEngine.NETIonics Isapi Rewrite FilterSQL Server PowerShell Extensions

    Read the article

  • Auto blocking attacking IP address

    - by dong
    This is to share my PowerShell code online. I original asked this question on MSDN forum (or TechNet?) here: http://social.technet.microsoft.com/Forums/en-US/winserversecurity/thread/f950686e-e3f8-4cf2-b8ec-2685c1ed7a77 In short, this is trying to find attacking IP address then add it into Firewall block rule. So I suppose: 1, You are running a Windows Server 2008 facing the Internet. 2, You need to have some port open for service, e.g. TCP 21 for FTP; TCP 3389 for Remote Desktop. You can see in my code I’m only dealing with these two since that’s what I opened. You can add further port number if you like, but the way to process might be different with these two. 3, I strongly suggest you use STRONG password and follow all security best practices, this ps1 code is NOT for adding security to your server, but reduce the nuisance from brute force attack, and make sys admin’s life easier: i.e. your FTP log won’t hold megabytes of nonsense, your Windows system log will not roll back and only can tell you what happened last month. 4, You are comfortable with setting up Windows Firewall rules, in my code, my rule has a name of “MY BLACKLIST”, you need to setup a similar one, and set it to BLOCK everything. 5, My rule is dangerous because it has the risk to block myself out as well. I do have a backup plan i.e. the DELL DRAC5 so that if that happens, I still can remote console to my server and reset the firewall. 6, By no means the code is perfect, the coding style, the use of PowerShell skills, the hard coded part, all can be improved, it’s just that it’s good enough for me already. It has been running on my server for more than 7 MONTHS. 7, Current code still has problem, I didn’t solve it yet, further on this point after the code. :)    #Dong Xie, March 2012  #my simple code to monitor attack and deal with it  #Windows Server 2008 Logon Type  #8: NetworkCleartext, i.e. FTP  #10: RemoteInteractive, i.e. RDP    $tick = 0;  "Start to run at: " + (get-date);    $regex1 = [regex] "192\.168\.100\.(?:101|102):3389\s+(\d+\.\d+\.\d+\.\d+)";  $regex2 = [regex] "Source Network Address:\t(\d+\.\d+\.\d+\.\d+)";    while($True) {   $blacklist = @();     "Running... (tick:" + $tick + ")"; $tick+=1;    #Port 3389  $a = @()  netstat -no | Select-String ":3389" | ? { $m = $regex1.Match($_); `    $ip = $m.Groups[1].Value; if ($m.Success -and $ip -ne "10.0.0.1") {$a = $a + $ip;} }  if ($a.count -gt 0) {    $ips = get-eventlog Security -Newest 1000 | Where-Object {$_.EventID -eq 4625 -and $_.Message -match "Logon Type:\s+10"} | foreach { `      $m = $regex2.Match($_.Message); $ip = $m.Groups[1].Value; $ip; } | Sort-Object | Tee-Object -Variable list | Get-Unique    foreach ($ip in $a) { if ($ips -contains $ip) {      if (-not ($blacklist -contains $ip)) {        $attack_count = ($list | Select-String $ip -SimpleMatch | Measure-Object).count;        "Found attacking IP on 3389: " + $ip + ", with count: " + $attack_count;        if ($attack_count -ge 20) {$blacklist = $blacklist + $ip;}      }      }    }  }      #FTP  $now = (Get-Date).AddMinutes(-5); #check only last 5 mins.     #Get-EventLog has built-in switch for EventID, Message, Time, etc. but using any of these it will be VERY slow.  $count = (Get-EventLog Security -Newest 1000 | Where-Object {$_.EventID -eq 4625 -and $_.Message -match "Logon Type:\s+8" -and `              $_.TimeGenerated.CompareTo($now) -gt 0} | Measure-Object).count;  if ($count -gt 50) #threshold  {     $ips = @();     $ips1 = dir "C:\inetpub\logs\LogFiles\FPTSVC2" | Sort-Object -Property LastWriteTime -Descending `       | select -First 1 | gc | select -Last 200 | where {$_ -match "An\+error\+occured\+during\+the\+authentication\+process."} `        | Select-String -Pattern "(\d+\.\d+\.\d+\.\d+)" | select -ExpandProperty Matches | select -ExpandProperty value | Group-Object `        | where {$_.Count -ge 10} | select -ExpandProperty Name;       $ips2 = dir "C:\inetpub\logs\LogFiles\FTPSVC3" | Sort-Object -Property LastWriteTime -Descending `       | select -First 1 | gc | select -Last 200 | where {$_ -match "An\+error\+occured\+during\+the\+authentication\+process."} `        | Select-String -Pattern "(\d+\.\d+\.\d+\.\d+)" | select -ExpandProperty Matches | select -ExpandProperty value | Group-Object `        | where {$_.Count -ge 10} | select -ExpandProperty Name;     $ips += $ips1; $ips += $ips2; $ips = $ips | where {$_ -ne "10.0.0.1"} | Sort-Object | Get-Unique;         foreach ($ip in $ips) {       if (-not ($blacklist -contains $ip)) {        "Found attacking IP on FTP: " + $ip;        $blacklist = $blacklist + $ip;       }     }  }        #Firewall change <# $current = (netsh advfirewall firewall show rule name="MY BLACKLIST" | where {$_ -match "RemoteIP"}).replace("RemoteIP:", "").replace(" ","").replace("/255.255.255.255",""); #inside $current there is no \r or \n need remove. foreach ($ip in $blacklist) { if (-not ($current -match $ip) -and -not ($ip -like "10.0.0.*")) {"Adding this IP into firewall blocklist: " + $ip; $c= 'netsh advfirewall firewall set rule name="MY BLACKLIST" new RemoteIP="{0},{1}"' -f $ip, $current; Invoke-Expression $c; } } #>    foreach ($ip in $blacklist) {    $fw=New-object –comObject HNetCfg.FwPolicy2; # http://blogs.technet.com/b/jamesone/archive/2009/02/18/how-to-manage-the-windows-firewall-settings-with-powershell.aspx    $myrule = $fw.Rules | where {$_.Name -eq "MY BLACKLIST"} | select -First 1; # Potential bug here?    if (-not ($myrule.RemoteAddresses -match $ip) -and -not ($ip -like "10.0.0.*"))      {"Adding this IP into firewall blocklist: " + $ip;         $myrule.RemoteAddresses+=(","+$ip);      }  }    Wait-Event -Timeout 30 #pause 30 secs    } # end of top while loop.   Further points: 1, I suppose the server is listening on port 3389 on server IP: 192.168.100.101 and 192.168.100.102, you need to replace that with your real IP. 2, I suppose you are Remote Desktop to this server from a workstation with IP: 10.0.0.1. Please replace as well. 3, The threshold for 3389 attack is 20, you don’t want to block yourself just because you typed your password wrong 3 times, you can change this threshold by your own reasoning. 4, FTP is checking the log for attack only to the last 5 mins, you can change that as well. 5, I suppose the server is serving FTP on both IP address and their LOG path are C:\inetpub\logs\LogFiles\FPTSVC2 and C:\inetpub\logs\LogFiles\FPTSVC3. Change accordingly. 6, FTP checking code is only asking for the last 200 lines of log, and the threshold is 10, change as you wish. 7, the code runs in a loop, you can set the loop time at the last line. To run this code, copy and paste to your editor, finish all the editing, get it to your server, and open an CMD window, then type powershell.exe –file your_powershell_file_name.ps1, it will start running, you can Ctrl-C to break it. This is what you see when it’s running: This is when it detected attack and adding the firewall rule: Regarding the design of the code: 1, There are many ways you can detect the attack, but to add an IP into a block rule is no small thing, you need to think hard before doing it, reason for that may include: You don’t want block yourself; and not blocking your customer/user, i.e. the good guy. 2, Thus for each service/port, I double check. For 3389, first it needs to show in netstat.exe, then the Event log; for FTP, first check the Event log, then the FTP log files. 3, At three places I need to make sure I’m not adding myself into the block rule. –ne with single IP, –like with subnet.   Now the final bit: 1, The code will stop working after a while (depends on how busy you are attacked, could be weeks, months, or days?!) It will throw Red error message in CMD, don’t Panic, it does no harm, but it also no longer blocking new attack. THE REASON is not confirmed with MS people: the COM object to manage firewall, you can only give it a list of IP addresses to the length of around 32KB I think, once it reaches the limit, you get the error message. 2, This is in fact my second solution to use the COM object, the first solution is still in the comment block for your reference, which is using netsh, that fails because being run from CMD, you can only throw it a list of IP to 8KB. 3, I haven’t worked the workaround yet, some ideas include: wrap that RemoteAddresses setting line with error checking and once it reaches the limit, use the newly detected IP to be the list, not appending to it. This basically reset your block rule to ground zero and lose the previous bad IPs. This does no harm as it sounds, because given a certain period has passed, any these bad IPs still not repent and continue the attack to you, it only got 30 seconds or 20 guesses of your password before you block it again. And there is the benefit that the bad IP may turn back to the good hands again, and you are not blocking a potential customer or your CEO’s home pc because once upon a time, it’s a zombie. Thus the ZEN of blocking: never block any IP for too long. 4, But if you insist to block the ugly forever, my other ideas include: You call MS support, ask them how can we set an arbitrary length of IP addresses in a rule; at least from my experiences at the Forum, they don’t know and they don’t care, because they think the dynamic blocking should be done by some expensive hardware. Or, from programming perspective, you can create a new rule once the old is full, then you’ll have MY BLACKLIST1, MY  BLACKLIST2, MY BLACKLIST3, … etc. Once in a while you can compile them together and start a business to sell your blacklist on the market! Enjoy the code! p.s. (PowerShell is REALLY REALLY GREAT!)

    Read the article

< Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >