Search Results

Search found 9426 results on 378 pages for 'monkey sort'.

Page 68/378 | < Previous Page | 64 65 66 67 68 69 70 71 72 73 74 75  | Next Page >

  • Logical Domain Modeling Made Simple

    - by Knut Vatsendvik
    How can logical domain modeling be made simple and collaborative? Many non-technical end-users, managers and business domain experts find it difficult to understand the visual models offered by many UML tools. This creates trouble in capturing and verifying the information that goes into a logical domain model. The tools are also too advanced and complex for a non-technical user to learn and use. We have therefore, in our current project, ended up with using Confluence as tool for designing the logical domain model with the help of a few very useful plugins. Big thanks to Ole Nymoen and Per Spilling for their expertise in this field that made this posting possible. Confluence Plugins Here is a list of Confluence plugins used in this solution. Install these before trying out the macros used below. Plugin Description Copy Space Allows a space administrator to copy a space, including the pages within the space Metadata Supports adding metadata to Wiki pages Label Manages labeling of pages Linking Contains macros for linking to templates, the dashboard and other Table Enhances the table capability in Confluence Creating a Confluence Space First we need to create a new confluence space for the domain model. Click the link Create a Space located below the list of spaces on the Dashboard. Please contact your Confluence administrator is you do not have permissions to do this.   For illustrative purpose all attributes and entities in this posting are based on my imaginary project manager domain model. When a logical domain model is good enough for being implemented, do a copy of the Confluence Space (see Copy Space plugin). In this way you create a stable version of the logical domain model while further design can continue with the new copied space. Typical will the implementation phase result in a database design and/or a XSD schema design. Add Space Templates Go to the Home page of your Confluence Space. Navigate to the Browse drop-down menu and click on Advanced. Then click the Templates option in the left navigation panel. Click Add New Space Template to add the following three templates. Name: attribute {metadata-list} || Name | | || Type | | || Format | | || Description | | {metadata-list} {add-label:attribute} Name: primary-type {metadata-list} || Name | || || Type | || || Format | || || Description | || {metadata-list} {add-label:primary-type} Name: complex-type {metadata-list} || Name | || || Description |  || {metadata-list} h3. Attributes || Name || Type || Format || Description || | [name] | {metadata-from:name|Type} | {metadata-from:name|Format} | {metadata-from:name|Description} | {add-label:complex-type,entity} The metadata-list macro (see Metadata plugin) will save a list of metadata values to the page. The add-label macro (see Label plugin) will automatically label the page. Primary Types Page Our first page to add will act as container for our primary types. Switch to Wiki markup when adding the following content to the page. | (+) {add-page:template=primary-type|parent=@self}Add new primary type{add-page} | {metadata-report:Name,Type,Format,Description|sort=Name|root=@self|pages=@descendents} Once the page is created, click the Add new primary type (create-page macro) to start creating a new pages. Here is an example of input to the LocalDate page. Embrace the LocalDate with square brackets [] to make the page linkable. Again switch to Wiki markup before editing. {metadata-list} || Name | [LocalDate] || || Type | Date || || Format | YYYY-MM-DD || || Description | Date in local time zone. YYYY = year, MM = month and DD = day || {metadata-list} {add-label:primary-type} The metadata-report macro will show a tabular report of all child pages.   Attributes Page The next page will act as container for all of our attributes. | (+) {add-page:template=attribute|parent=@self|title=attribute}Add new attribute{add-page} | {metadata-report:Name,Type,Format,Description|sort=Name|pages=@descendants} Here is an example of input to the startDate page. {metadata-list} || Name | [startDate] || || Type | [LocalDate] || || Format | {metadata-from:LocalDate|Format} || || Description | The projects start date || {metadata-list} {add-label:attribute} Using the metadata-from macro we fetch the text from the previously created LocalDate page. Complex Types Page The last page in this example shows how attributes can be combined together to form more complex types.   h3. Intro Overview of complex types in the domain model. | (+) {add-page:template=complex-type|parent=@self}Add a new complex type{add-page}\\ | {metadata-report:Name,Description|sort=Name|root=@self|pages=@descendents} Here is an example of input to the ProjectType page. {metadata-list} || Name | [ProjectType] || || Description | Represents a project || {metadata-list} h3. Attributes || Name || Type || Format || Description || | [projectId] | {metadata-from:projectId|Type} | {metadata-from:projectId|Format} | {metadata-from:projectId|Description} | | [name] | {metadata-from:name|Type} | {metadata-from:name|Format} | {metadata-from:name|Description} | | [description] | {metadata-from:description|Type} | {metadata-from:description|Format} | {metadata-from:description|Description} | | [startDate] | {metadata-from:startDate|Type} | {metadata-from:startDate|Format} | {metadata-from:startDate|Description} | {add-label:complex-type,entity} Gives us this Conclusion Using a web-based corporate Wiki like Confluence to create a logical domain model increases the collaboration between people with different roles in the enterprise. It’s my believe that this helps the domain model to be more accurate, and better documented. In our real project we have more pages than illustrated here to complete the documentation. We do also still use UML tools to create different types of diagrams that Confluence do not support. As a last tip, an ImageMap plugin can make those diagrams clickable when used in pages. Enjoy!

    Read the article

  • Talend Enterprise Data Integration overperforms on Oracle SPARC T4

    - by Amir Javanshir
    The SPARC T microprocessor, released in 2005 by Sun Microsystems, and now continued at Oracle, has a good track record in parallel execution and multi-threaded performance. However it was less suited for pure single-threaded workloads. The new SPARC T4 processor is now filling that gap by offering a 5x better single-thread performance over previous generations. Following our long-term relationship with Talend, a fast growing ISV positioned by Gartner in the “Visionaries” quadrant of the “Magic Quadrant for Data Integration Tools”, we decided to test some of their integration components with the T4 chip, more precisely on a T4-1 system, in order to verify first hand if this new processor stands up to its promises. Several tests were performed, mainly focused on: Single-thread performance of the new SPARC T4 processor compared to an older SPARC T2+ processor Overall throughput of the SPARC T4-1 server using multiple threads The tests consisted in reading large amounts of data --ten's of gigabytes--, processing and writing them back to a file or an Oracle 11gR2 database table. They are CPU, memory and IO bound tests. Given the main focus of this project --CPU performance--, bottlenecks were removed as much as possible on the memory and IO sub-systems. When possible, the data to process was put into the ZFS filesystem cache, for instance. Also, two external storage devices were directly attached to the servers under test, each one divided in two ZFS pools for read and write operations. Multi-thread: Testing throughput on the Oracle T4-1 The tests were performed with different number of simultaneous threads (1, 2, 4, 8, 12, 16, 32, 48 and 64) and using different storage devices: Flash, Fibre Channel storage, two stripped internal disks and one single internal disk. All storage devices used ZFS as filesystem and volume management. Each thread read a dedicated 1GB-large file containing 12.5M lines with the following structure: customerID;FirstName;LastName;StreetAddress;City;State;Zip;Cust_Status;Since_DT;Status_DT 1;Ronald;Reagan;South Highway;Santa Fe;Montana;98756;A;04-06-2006;09-08-2008 2;Theodore;Roosevelt;Timberlane Drive;Columbus;Louisiana;75677;A;10-05-2009;27-05-2008 3;Andrew;Madison;S Rustle St;Santa Fe;Arkansas;75677;A;29-04-2005;09-02-2008 4;Dwight;Adams;South Roosevelt Drive;Baton Rouge;Vermont;75677;A;15-02-2004;26-01-2007 […] The following graphs present the results of our tests: Unsurprisingly up to 16 threads, all files fit in the ZFS cache a.k.a L2ARC : once the cache is hot there is no performance difference depending on the underlying storage. From 16 threads upwards however, it is clear that IO becomes a bottleneck, having a good IO subsystem is thus key. Single-disk performance collapses whereas the Sun F5100 and ST6180 arrays allow the T4-1 to scale quite seamlessly. From 32 to 64 threads, the performance is almost constant with just a slow decline. For the database load tests, only the best IO configuration --using external storage devices-- were used, hosting the Oracle table spaces and redo log files. Using the Sun Storage F5100 array allows the T4-1 server to scale up to 48 parallel JVM processes before saturating the CPU. The final result is a staggering 646K lines per second insertion in an Oracle table using 48 parallel threads. Single-thread: Testing the single thread performance Seven different tests were performed on both servers. Given the fact that only one thread, thus one file was read, no IO bottleneck was involved, all data being served from the ZFS cache. Read File ? Filter ? Write File: Read file, filter data, write the filtered data in a new file. The filter is set on the “Status” column: only lines with status set to “A” are selected. This limits each output file to about 500 MB. Read File ? Load Database Table: Read file, insert into a single Oracle table. Average: Read file, compute the average of a numeric column, write the result in a new file. Division & Square Root: Read file, perform a division and square root on a numeric column, write the result data in a new file. Oracle DB Dump: Dump the content of an Oracle table (12.5M rows) into a CSV file. Transform: Read file, transform, write the result data in a new file. The transformations applied are: set the address column to upper case and add an extra column at the end, which is the concatenation of two columns. Sort: Read file, sort a numeric and alpha numeric column, write the result data in a new file. The following table and graph present the final results of the tests: Throughput unit is thousand lines per second processed (K lines/second). Improvement is the % of improvement between the T5140 and T4-1. Test T4-1 (Time s.) T5140 (Time s.) Improvement T4-1 (Throughput) T5140 (Throughput) Read/Filter/Write 125 806 645% 100 16 Read/Load Database 195 1111 570% 64 11 Average 96 557 580% 130 22 Division & Square Root 161 1054 655% 78 12 Oracle DB Dump 164 945 576% 76 13 Transform 159 1124 707% 79 11 Sort 251 1336 532% 50 9 The improvement of single-thread performance is quite dramatic: depending on the tests, the T4 is between 5.4 to 7 times faster than the T2+. It seems clear that the SPARC T4 processor has gone a long way filling the gap in single-thread performance, without sacrifying the multi-threaded capability as it still shows a very impressive scaling on heavy-duty multi-threaded jobs. Finally, as always at Oracle ISV Engineering, we are happy to help our ISV partners test their own applications on our platforms, so don't hesitate to contact us and let's see what the SPARC T4-based systems can do for your application! "As describe in this benchmark, Talend Enterprise Data Integration has overperformed on T4. I was generally happy to see that the T4 gave scaling opportunities for many scenarios like complex aggregations. Row by row insertion in Oracle DB is faster with more than 650,000 rows per seconds without using any bulk Oracle capabilities !" Cedric Carbone, Talend CTO.

    Read the article

  • SQL Table stored as a Heap - the dangers within

    - by MikeD
    Nearly all of the time I create a table, I include a primary key, and often that PK is implemented as a clustered index. Those two don't always have to go together, but in my world they almost always do. On a recent project, I was working on a data warehouse and a set of SSIS packages to import data from an OLTP database into my data warehouse. The data I was importing from the business database into the warehouse was mostly new rows, sometimes updates to existing rows, and sometimes deletes. I decided to use the MERGE statement to implement the insert, update or delete in the data warehouse, I found it quite performant to have a stored procedure that extracted all the new, updated, and deleted rows from the source database and dump it into a working table in my data warehouse, then run a stored proc in the warehouse that was the MERGE statement that took the rows from the working table and updated the real fact table. Use Warehouse CREATE TABLE Integration.MergePolicy (PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date, Operation varchar(5)) CREATE TABLE fact.Policy (PolicyKey int identity primary key, PolicyId int, PolicyTypeKey int, Premium money, Deductible money, EffectiveDate date) CREATE PROC Integration.MergePolicy as begin begin tran Merge fact.Policy as tgtUsing Integration.MergePolicy as SrcOn (tgt.PolicyId = Src.PolicyId) When not matched by Target then Insert (PolicyId, PolicyTypeKey, Premium, Deductible, EffectiveDate)values (src.PolicyId, src.PolicyTypeKey, src.Premium, src.Deductible, src.EffectiveDate) When matched and src.Operation = 'U' then Update set PolicyTypeKey = src.PolicyTypeKey,Premium = src.Premium,Deductible = src.Deductible,EffectiveDate = src.EffectiveDate When matched and src.Operation = 'D' then Delete ;delete from Integration.WorkPolicy commit end Notice that my worktable (Integration.MergePolicy) doesn't have any primary key or clustered index. I didn't think this would be a problem, since it was relatively small table and was empty after each time I ran the stored proc. For one of the work tables, during the initial loads of the warehouse, it was getting about 1.5 million rows inserted, processed, then deleted. Also, because of a bug in the extraction process, the same 1.5 million rows (plus a few hundred more each time) was getting inserted, processed, and deleted. This was being sone on a fairly hefty server that was otherwise unused, and no one was paying any attention to the time it was taking. This week I received a backup of this database and loaded it on my laptop to troubleshoot the problem, and of course it took a good ten minutes or more to run the process. However, what seemed strange to me was that after I fixed the problem and happened to run the merge sproc when the work table was completely empty, it still took almost ten minutes to complete. I immediately looked back at the MERGE statement to see if I had some sort of outer join that meant it would be scanning the target table (which had about 2 million rows in it), then turned on the execution plan output to see what was happening under the hood. Running the stored procedure again took a long time, and the plan output didn't show me much - 55% on the MERGE statement, and 45% on the DELETE statement, and table scans on the work table in both places. I was surprised at the relative cost of the DELETE statement, because there were really 0 rows to delete, but I was expecting to see the table scans. (I was beginning now to suspect that my problem was because the work table was being stored as a heap.) Then I turned on STATS_IO and ran the sproc again. The output was quite interesting.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'Policy'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.Table 'MergePolicy'. Scan count 1, logical reads 433276, physical reads 60, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. I've reproduced the above from memory, the details aren't exact, but the essential bit was the very high number of logical reads on the table stored as a heap. Even just doing a SELECT Count(*) from Integration.MergePolicy incurred that sort of output, even though the result was always 0. I suppose I should research more on the allocation and deallocation of pages to tables stored as a heap, but I haven't, and my original assumption that a table stored as a heap with no rows would only need to read one page to answer any query was definitely proven wrong. It's likely that some sort of physical defragmentation of the table may have cleaned that up, but it seemed that the easiest answer was to put a clustered index on the table. After doing so, the execution plan showed a cluster index scan, and the IO stats showed only a single page read. (I aborted my first attempt at adding a clustered index on the table because it was taking too long - instead I ran TRUNCATE TABLE Integration.MergePolicy first and added the clustered index, both of which took very little time). I suspect I may not have noticed this if I had used TRUNCATE TABLE Integration.MergePolicy instead of DELETE FROM Integration.MergePolicy, since I'm guessing that the truncate operation does some rather quick releasing of pages allocated to the heap table. In the future, I will likely be much more careful to have a clustered index on every table I use, even the working tables. Mike  

    Read the article

  • Breaking 1NF to model subset constraints. Does this sound sane?

    - by Chris Travers
    My first question here. Appologize if it is in the wrong forum but this seems pretty conceptual. I am looking at doing something that goes against conventional wisdom and want to get some feedback as to whether this is totally insane or will result in problems, so critique away! I am on PostgreSQL 9.1 but may be moving to 9.2 for this part of this project. To re-iterate: Does it seem sane to break 1NF in this way? I am not looking for debugging code so much as where people see problems that this might lead. The Problem In double entry accounting, financial transactions are journal entries with an arbitrary number of lines. Each line has either a left value (debit) or a right value (credit) which can be modelled as a single value with negatives as debits and positives as credits or vice versa. The sum of all debits and credits must equal zero (so if we go with a single amount field, sum(amount) must equal zero for each financial journal entry). SQL-based databases, pretty much required for this sort of work, have no way to express this sort of constraint natively and so any approach to enforcing it in the database seems rather complex. The Write Model The journal entries are append only. There is a possibility we will add a delete model but it will be subject to a different set of restrictions and so is not applicable here. If and when we allow deletes, we will probably do them using a simple ON DELETE CASCADE designation on the foreign key, and require that deletes go through a dedicated stored procedure which can enforce the other constraints. So inserts and selects have to be accommodated but updates and deletes do not for this task. My Proposed Solution My proposed solution is to break first normal form and model constraints on arrays of tuples, with a trigger that breaks the rows out into another table. CREATE TABLE journal_line ( entry_id bigserial primary key, account_id int not null references account(id), journal_entry_id bigint not null, -- adding references later amount numeric not null ); I would then add "table methods" to extract debits and credits for reporting purposes: CREATE OR REPLACE FUNCTION debits(journal_line) RETURNS numeric LANGUAGE sql IMMUTABLE AS $$ SELECT CASE WHEN $1.amount < 0 THEN $1.amount * -1 ELSE NULL END; $$; CREATE OR REPLACE FUNCTION credits(journal_line) RETURNS numeric LANGUAGE sql IMMUTABLE AS $$ SELECT CASE WHEN $1.amount > 0 THEN $1.amount ELSE NULL END; $$; Then the journal entry table (simplified for this example): CREATE TABLE journal_entry ( entry_id bigserial primary key, -- no natural keys :-( journal_id int not null references journal(id), date_posted date not null, reference text not null, description text not null, journal_lines journal_line[] not null ); Then a table method and and check constraints: CREATE OR REPLACE FUNCTION running_total(journal_entry) returns numeric language sql immutable as $$ SELECT sum(amount) FROM unnest($1.journal_lines); $$; ALTER TABLE journal_entry ADD CONSTRAINT CHECK (((journal_entry.running_total) = 0)); ALTER TABLE journal_line ADD FOREIGN KEY journal_entry_id REFERENCES journal_entry(entry_id); And finally we'd have a breakout trigger: CREATE OR REPLACE FUNCTION je_breakout() RETURNS TRIGGER LANGUAGE PLPGSQL AS $$ BEGIN IF TG_OP = 'INSERT' THEN INSERT INTO journal_line (journal_entry_id, account_id, amount) SELECT NEW.id, account_id, amount FROM unnest(NEW.journal_lines); RETURN NEW; ELSE RAISE EXCEPTION 'Operation Not Allowed'; END IF; END; $$; And finally CREATE TRIGGER AFTER INSERT OR UPDATE OR DELETE ON journal_entry FOR EACH ROW EXECUTE_PROCEDURE je_breaout(); Of course the example above is simplified. There will be a status table that will track approval status allowing for separation of duties, etc. However the goal here is to prevent unbalanced transactions. Any feedback? Does this sound entirely insane? Standard Solutions? In getting to this point I have to say I have looked at four different current ERP solutions to this problems: Represent every line item as a debit and a credit against different accounts. Use of foreign keys against the line item table to enforce an eventual running total of 0 Use of constraint triggers in PostgreSQL Forcing all validation here solely through the app logic. My concerns are that #1 is pretty limiting and very hard to audit internally. It's not programmer transparent and so it strikes me as being difficult to work with in the future. The second strikes me as being very complex and required a series of contraints and foreign keys against self to make work, and therefore it strikes me as complex, hard to sort out at least in my mind, and thus hard to work with. The fourth could be done as we force all access through stored procedures anyway and this is the most common solution (have the app total things up and throw an error otherwise). However, I think proof that a constraint is followed is superior to test cases, and so the question becomes whether this in fact generates insert anomilies rather than solving them. If this is a solved problem it isn't the case that everyone agrees on the solution....

    Read the article

  • k-d tree implementation [closed]

    - by user466441
    when i run my code and debugged,i got this error - this 0x00093584 {_Myproxy=0x00000000 _Mynextiter=0x00000000 } std::_Iterator_base12 * const - _Myproxy 0x00000000 {_Mycont=??? _Myfirstiter=??? } std::_Container_proxy * _Mycont CXX0017: Error: symbol "" not found _Myfirstiter CXX0030: Error: expression cannot be evaluated + _Mynextiter 0x00000000 {_Myproxy=??? _Mynextiter=??? } std::_Iterator_base12 * but i dont know what does it means,code is this #include<iostream> #include<vector> #include<algorithm> using namespace std; struct point { float x,y; }; vector<point>pointleft(4); vector<point>pointright(4); //we are going to implement two comparison function for x and y coordinates,we need it in calculation of median (we should sort vector //by x or y according to depth informaton,is depth even or odd. bool sortby_X(point &a,point &b) { return a.x<b.x; } bool sortby_Y(point &a,point &b) { return a.y<b.y; } //so i am going to implement to median finding algorithm,one for finding median by x and another find median by y point medianx(vector<point>points) { point temp; sort(points.begin(),points.end(),sortby_X); temp=points[(points.size()/2)]; return temp; } point mediany(vector<point>points) { point temp; sort(points.begin(),points.end(),sortby_Y); temp=points[(points.size()/2)]; return temp; } //now construct basic tree structure struct Tree { float x,y; Tree(point a) { x=a.x; y=a.y; } Tree *left; Tree *right; }; Tree * build_kd( Tree *root,vector<point>points,int depth) { point temp; if(points.size()==1)// that point is as a leaf { if(root==NULL) root=new Tree(points[0]); return root; } if(depth%2==0) { temp=medianx(points); root=new Tree(temp); for(int i=0;i<points.size();i++) { if (points[i].x<temp.x) pointleft[i]=points[i]; else pointright[i]=points[i]; } } else { temp=mediany(points); root=new Tree(temp); for(int i=0;i<points.size();i++) { if(points[i].y<temp.y) pointleft[i]=points[i]; else pointright[i]=points[i]; } } return build_kd(root->left,pointleft,depth+1); return build_kd(root->right,pointright,depth+1); } void print(Tree *root) { while(root!=NULL) { cout<<root->x<<" " <<root->y; print(root->left); print(root->right); } } int main() { int depth=0; Tree *root=NULL; vector<point>points(4); float x,y; int n=4; for(int i=0;i<n;i++) { cin>>x>>y; points[i].x=x; points[i].y=y; } root=build_kd(root,points,depth); print(root); return 0; } i am trying ti implement in c++ this pseudo code tuple function build_kd_tree(int depth, set points): if points contains only one point: return that point as a leaf. if depth is even: Calculate the median x-value. Create a set of points (pointsLeft) that have x-values less than the median. Create a set of points (pointsRight) that have x-values greater than or equal to the median. else: Calculate the median y-value. Create a set of points (pointsLeft) that have y-values less than the median. Create a set of points (pointsRight) that have y-values greater than or equal to the median. treeLeft = build_kd_tree(depth + 1, pointsLeft) treeRight = build_kd_tree(depth + 1, pointsRight) return(median, treeLeft, treeRight) please help me what this error means?

    Read the article

  • spliiting code in java-don't know what's wrong [closed]

    - by ???? ?????
    I'm writing a code to split a file into many files with a size specified in the code, and then it will join these parts later. The problem is with the joining code, it doesn't work and I can't figure what is wrong! This is my code: import java.io.*; import java.util.*; public class StupidSplit { static final int Chunk_Size = 10; static int size =0; public static void main(String[] args) throws IOException { String file = "b.txt"; int chunks = DivideFile(file); System.out.print((new File(file)).delete()); System.out.print(JoinFile(file, chunks)); } static boolean JoinFile(String fname, int nChunks) { /* * Joins the chunks together. Chunks have been divided using DivideFile * function so the last part of filename will ".partxxxx" Checks if all * parts are together by matching number of chunks found against * "nChunks", then joins the file otherwise throws an error. */ boolean successful = false; File currentDirectory = new File(System.getProperty("user.dir")); // File[] fileList = currentDirectory.listFiles(); /* populate only the files having extension like "partxxxx" */ List<File> lst = new ArrayList<File>(); // Arrays.sort(fileList); for (File file : fileList) { if (file.isFile()) { String fnm = file.getName(); int lastDot = fnm.lastIndexOf('.'); // add to list which match the name given by "fname" and have //"partxxxx" as extension" if (fnm.substring(0, lastDot).equalsIgnoreCase(fname) && (fnm.substring(lastDot + 1)).substring(0, 4).equals("part")) { lst.add(file); } } } /* * sort the list - it will be sorted by extension only because we have * ensured that list only contains those files that have "fname" and * "part" */ File[] files = (File[]) lst.toArray(new File[0]); Arrays.sort(files); System.out.println("size ="+files.length); System.out.println("hello"); /* Ensure that number of chunks match the length of array */ if (files.length == nChunks-1) { File ofile = new File(fname); FileOutputStream fos; FileInputStream fis; byte[] fileBytes; int bytesRead = 0; try { fos = new FileOutputStream(ofile,true); for (File file : files) { fis = new FileInputStream(file); fileBytes = new byte[(int) file.length()]; bytesRead = fis.read(fileBytes, 0, (int) file.length()); assert(bytesRead == fileBytes.length); assert(bytesRead == (int) file.length()); fos.write(fileBytes); fos.flush(); fileBytes = null; fis.close(); fis = null; } fos.close(); fos = null; } catch (FileNotFoundException fnfe) { System.out.println("Could not find file"); successful = false; return successful; } catch (IOException ioe) { System.out.println("Cannot write to disk"); successful = false; return successful; } /* ensure size of file matches the size given by server */ successful = (ofile.length() == StupidSplit.size) ? true : false; } else { successful = false; } return successful; } static int DivideFile(String fname) { File ifile = new File(fname); FileInputStream fis; String newName; FileOutputStream chunk; //int fileSize = (int) ifile.length(); double fileSize = (double) ifile.length(); //int nChunks = 0, read = 0, readLength = Chunk_Size; int nChunks = 0, read = 0, readLength = Chunk_Size; byte[] byteChunk; try { fis = new FileInputStream(ifile); StupidSplit.size = (int)ifile.length(); while (fileSize > 0) { if (fileSize <= Chunk_Size) { readLength = (int) fileSize; } byteChunk = new byte[readLength]; read = fis.read(byteChunk, 0, readLength); fileSize -= read; assert(read==byteChunk.length); nChunks++; //newName = fname + ".part" + Integer.toString(nChunks - 1); newName = String.format("%s.part%09d", fname, nChunks - 1); chunk = new FileOutputStream(new File(newName)); chunk.write(byteChunk); chunk.flush(); chunk.close(); byteChunk = null; chunk = null; } fis.close(); System.out.println(nChunks); // fis = null; } catch (FileNotFoundException fnfe) { System.out.println("Could not find the given file"); System.exit(-1); } catch (IOException ioe) { System.out .println("Error while creating file chunks. Exiting program"); System.exit(-1); }System.out.println(nChunks); return nChunks; } } }

    Read the article

  • Using VLOOKUP in Excel

    - by Mark Virtue
    VLOOKUP is one of Excel’s most useful functions, and it’s also one of the least understood.  In this article, we demystify VLOOKUP by way of a real-life example.  We’ll create a usable Invoice Template for a fictitious company. So what is VLOOKUP?  Well, of course it’s an Excel function.  This article will assume that the reader already has a passing understanding of Excel functions, and can use basic functions such as SUM, AVERAGE, and TODAY.  In its most common usage, VLOOKUP is a database function, meaning that it works with database tables – or more simply, lists of things in an Excel worksheet.  What sort of things?   Well, any sort of thing.  You may have a worksheet that contains a list of employees, or products, or customers, or CDs in your CD collection, or stars in the night sky.  It doesn’t really matter. Here’s an example of a list, or database.  In this case it’s a list of products that our fictitious company sells: Usually lists like this have some sort of unique identifier for each item in the list.  In this case, the unique identifier is in the “Item Code” column.  Note:  For the VLOOKUP function to work with a database/list, that list must have a column containing the unique identifier (or “key”, or “ID”), and that column must be the first column in the table.  Our sample database above satisfies this criterion. The hardest part of using VLOOKUP is understanding exactly what it’s for.  So let’s see if we can get that clear first: VLOOKUP retrieves information from a database/list based on a supplied instance of the unique identifier. Put another way, if you put the VLOOKUP function into a cell and pass it one of the unique identifiers from your database, it will return you one of the pieces of information associated with that unique identifier.  In the example above, you would pass VLOOKUP an item code, and it would return to you either the corresponding item’s description, its price, or its availability (its “In stock” quantity).  Which of these pieces of information will it pass you back?  Well, you get to decide this when you’re creating the formula. If all you need is one piece of information from the database, it would be a lot of trouble to go to to construct a formula with a VLOOKUP function in it.  Typically you would use this sort of functionality in a reusable spreadsheet, such as a template.  Each time someone enters a valid item code, the system would retrieve all the necessary information about the corresponding item. Let’s create an example of this:  An Invoice Template that we can reuse over and over in our fictitious company. First we start Excel… …and we create ourselves a blank invoice: This is how it’s going to work:  The person using the invoice template will fill in a series of item codes in column “A”, and the system will retrieve each item’s description and price, which will be used to calculate the line total for each item (assuming we enter a valid quantity). For the purposes of keeping this example simple, we will locate the product database on a separate sheet in the same workbook: In reality, it’s more likely that the product database would be located in a separate workbook.  It makes little difference to the VLOOKUP function, which doesn’t really care if the database is located on the same sheet, a different sheet, or a completely different workbook. In order to test the VLOOKUP formula we’re about to write, we first enter a valid item code into cell A11: Next, we move the active cell to the cell in which we want information retrieved from the database by VLOOKUP to be stored.  Interestingly, this is the step that most people get wrong.  To explain further:  We are about to create a VLOOKUP formula that will retrieve the description that corresponds to the item code in cell A11.  Where do we want this description put when we get it?  In cell B11, of course.  So that’s where we write the VLOOKUP formula – in cell B11. Select cell B11: We need to locate the list of all available functions that Excel has to offer, so that we can choose VLOOKUP and get some assistance in completing the formula.  This is found by first clicking the Formulas tab, and then clicking Insert Function:   A box appears that allows us to select any of the functions available in Excel.  To find the one we’re looking for, we could type a search term like “lookup” (because the function we’re interested in is a lookup function).  The system would return us a list of all lookup-related functions in Excel.  VLOOKUP is the second one in the list.  Select it an click OK… The Function Arguments box appears, prompting us for all the arguments (or parameters) needed in order to complete the VLOOKUP function.  You can think of this box as the function is asking us the following questions: What unique identifier are you looking up in the database? Where is the database? Which piece of information from the database, associated with the unique identifier, do you wish to have retrieved for you? The first three arguments are shown in bold, indicating that they are mandatory arguments (the VLOOKUP function is incomplete without them and will not return a valid value).  The fourth argument is not bold, meaning that it’s optional:   We will complete the arguments in order, top to bottom. The first argument we need to complete is the Lookup_value argument.  The function needs us to tell it where to find the unique identifier (the item code in this case) that it should be retuning the description of.  We must select the item code we entered earlier (in A11). Click on the selector icon to the right of the first argument: Then click once on the cell containing the item code (A11), and press Enter: The value of “A11” is inserted into the first argument. Now we need to enter a value for the Table_array argument.  In other words, we need to tell VLOOKUP where to find the database/list.  Click on the selector icon next to the second argument: Now locate the database/list and select the entire list – not including the header line.  The database is located on a separate worksheet, so we first click on that worksheet tab: Next we select the entire database, not including the header line: …and press Enter.  The range of cells that represents the database (in this case “’Product Database’!A2:D7”) is entered automatically for us into the second argument. Now we need to enter the third argument, Col_index_num.  We use this argument to specify to VLOOKUP which piece of information from the database, associate with our item code in A11, we wish to have returned to us.  In this particular example, we wish to have the item’s description returned to us.  If you look on the database worksheet, you’ll notice that the “Description” column is the second column in the database.  This means that we must enter a value of “2” into the Col_index_num box: It is important to note that that we are not entering a “2” here because the “Description” column is in the B column on that worksheet.  If the database happened to start in column K of the worksheet, we would still enter a “2” in this field. Finally, we need to decide whether to enter a value into the final VLOOKUP argument, Range_lookup.  This argument requires either a true or false value, or it should be left blank.  When using VLOOKUP with databases (as is true 90% of the time), then the way to decide what to put in this argument can be thought of as follows: If the first column of the database (the column that contains the unique identifiers) is sorted alphabetically/numerically in ascending order, then it’s possible to enter a value of true into this argument, or leave it blank. If the first column of the database is not sorted, or it’s sorted in descending order, then you must enter a value of false into this argument As the first column of our database is not sorted, we enter false into this argument: That’s it!  We’ve entered all the information required for VLOOKUP to return the value we need.  Click the OK button and notice that the description corresponding to item code “R99245” has been correctly entered into cell B11: The formula that was created for us looks like this: If we enter a different item code into cell A11, we will begin to see the power of the VLOOKUP function:  The description cell changes to match the new item code: We can perform a similar set of steps to get the item’s price returned into cell E11.  Note that the new formula must be created in cell E11.  The result will look like this: …and the formula will look like this: Note that the only difference between the two formulae is the third argument (Col_index_num) has changed from a “2” to a “3” (because we want data retrieved from the 3rd column in the database). If we decided to buy 2 of these items, we would enter a “2” into cell D11.  We would then enter a simple formula into cell F11 to get the line total: =D11*E11 …which looks like this… Completing the Invoice Template We’ve learned a lot about VLOOKUP so far.  In fact, we’ve learned all we’re going to learn in this article.  It’s important to note that VLOOKUP can be used in other circumstances besides databases.  This is less common, and may be covered in future How-To Geek articles. Our invoice template is not yet complete.  In order to complete it, we would do the following: We would remove the sample item code from cell A11 and the “2” from cell D11.  This will cause our newly created VLOOKUP formulae to display error messages: We can remedy this by judicious use of Excel’s IF() and ISBLANK() functions.  We change our formula from this…       =VLOOKUP(A11,’Product Database’!A2:D7,2,FALSE) …to this…       =IF(ISBLANK(A11),”",VLOOKUP(A11,’Product Database’!A2:D7,2,FALSE)) We would copy the formulas in cells B11, E11 and F11 down to the remainder of the item rows of the invoice.  Note that if we do this, the resulting formulas will no longer correctly refer to the database table.  We could fix this by changing the cell references for the database to absolute cell references.  Alternatively – and even better – we could create a range name for the entire product database (such as “Products”), and use this range name instead of the cell references.  The formula would change from this…       =IF(ISBLANK(A11),”",VLOOKUP(A11,’Product Database’!A2:D7,2,FALSE)) …to this…       =IF(ISBLANK(A11),”",VLOOKUP(A11,Products,2,FALSE)) …and then copy the formulas down to the rest of the invoice item rows. We would probably “lock” the cells that contain our formulae (or rather unlock the other cells), and then protect the worksheet, in order to ensure that our carefully constructed formulae are not accidentally overwritten when someone comes to fill in the invoice. We would save the file as a template, so that it could be reused by everyone in our company If we were feeling really clever, we would create a database of all our customers in another worksheet, and then use the customer ID entered in cell F5 to automatically fill in the customer’s name and address in cells B6, B7 and B8. If you would like to practice with VLOOKUP, or simply see our resulting Invoice Template, it can be downloaded from here. Similar Articles Productive Geek Tips Make Excel 2007 Print Gridlines In Workbook FileMake Excel 2007 Always Save in Excel 2003 FormatConvert Older Excel Documents to Excel 2007 FormatImport Microsoft Access Data Into ExcelChange the Default Font in Excel 2007 TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 Classic Cinema Online offers 100’s of OnDemand Movies OutSync will Sync Photos of your Friends on Facebook and Outlook Windows 7 Easter Theme YoWindoW, a real time weather screensaver Optimize your computer the Microsoft way Stormpulse provides slick, real time weather data

    Read the article

  • Beware Sneaky Reads with Unique Indexes

    - by Paul White NZ
    A few days ago, Sandra Mueller (twitter | blog) asked a question using twitter’s #sqlhelp hash tag: “Might SQL Server retrieve (out-of-row) LOB data from a table, even if the column isn’t referenced in the query?” Leaving aside trivial cases (like selecting a computed column that does reference the LOB data), one might be tempted to say that no, SQL Server does not read data you haven’t asked for.  In general, that’s quite correct; however there are cases where SQL Server might sneakily retrieve a LOB column… Example Table Here’s a T-SQL script to create that table and populate it with 1,000 rows: CREATE TABLE dbo.LOBtest ( pk INTEGER IDENTITY NOT NULL, some_value INTEGER NULL, lob_data VARCHAR(MAX) NULL, another_column CHAR(5) NULL, CONSTRAINT [PK dbo.LOBtest pk] PRIMARY KEY CLUSTERED (pk ASC) ); GO DECLARE @Data VARCHAR(MAX); SET @Data = REPLICATE(CONVERT(VARCHAR(MAX), 'x'), 65540);   WITH Numbers (n) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2 ) INSERT LOBtest WITH (TABLOCKX) ( some_value, lob_data ) SELECT TOP (1000) N.n, @Data FROM Numbers N WHERE N.n <= 1000; Test 1: A Simple Update Let’s run a query to subtract one from every value in the some_value column: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; As you might expect, modifying this integer column in 1,000 rows doesn’t take very long, or use many resources.  The STATITICS IO and TIME output shows a total of 9 logical reads, and 25ms elapsed time.  The query plan is also very simple: Looking at the Clustered Index Scan, we can see that SQL Server only retrieves the pk and some_value columns during the scan: The pk column is needed by the Clustered Index Update operator to uniquely identify the row that is being changed.  The some_value column is used by the Compute Scalar to calculate the new value.  (In case you are wondering what the Top operator is for, it is used to enforce SET ROWCOUNT). Test 2: Simple Update with an Index Now let’s create a nonclustered index keyed on the some_value column, with lob_data as an included column: CREATE NONCLUSTERED INDEX [IX dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest (some_value) INCLUDE ( lob_data ) WITH ( FILLFACTOR = 100, MAXDOP = 1, SORT_IN_TEMPDB = ON ); This is not a useful index for our simple update query; imagine that someone else created it for a different purpose.  Let’s run our update query again: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; We find that it now requires 4,014 logical reads and the elapsed query time has increased to around 100ms.  The extra logical reads (4 per row) are an expected consequence of maintaining the nonclustered index. The query plan is very similar to before (click to enlarge): The Clustered Index Update operator picks up the extra work of maintaining the nonclustered index. The new Compute Scalar operators detect whether the value in the some_value column has actually been changed by the update.  SQL Server may be able to skip maintaining the nonclustered index if the value hasn’t changed (see my previous post on non-updating updates for details).  Our simple query does change the value of some_data in every row, so this optimization doesn’t add any value in this specific case. The output list of columns from the Clustered Index Scan hasn’t changed from the one shown previously: SQL Server still just reads the pk and some_data columns.  Cool. Overall then, adding the nonclustered index hasn’t had any startling effects, and the LOB column data still isn’t being read from the table.  Let’s see what happens if we make the nonclustered index unique. Test 3: Simple Update with a Unique Index Here’s the script to create a new unique index, and drop the old one: CREATE UNIQUE NONCLUSTERED INDEX [UQ dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest (some_value) INCLUDE ( lob_data ) WITH ( FILLFACTOR = 100, MAXDOP = 1, SORT_IN_TEMPDB = ON ); GO DROP INDEX [IX dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest; Remember that SQL Server only enforces uniqueness on index keys (the some_data column).  The lob_data column is simply stored at the leaf-level of the non-clustered index.  With that in mind, we might expect this change to make very little difference.  Let’s see: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; Whoa!  Now look at the elapsed time and logical reads: Scan count 1, logical reads 2016, physical reads 0, read-ahead reads 0, lob logical reads 36015, lob physical reads 0, lob read-ahead reads 15992.   CPU time = 172 ms, elapsed time = 16172 ms. Even with all the data and index pages in memory, the query took over 16 seconds to update just 1,000 rows, performing over 52,000 LOB logical reads (nearly 16,000 of those using read-ahead). Why on earth is SQL Server reading LOB data in a query that only updates a single integer column? The Query Plan The query plan for test 3 looks a bit more complex than before: In fact, the bottom level is exactly the same as we saw with the non-unique index.  The top level has heaps of new stuff though, which I’ll come to in a moment. You might be expecting to find that the Clustered Index Scan is now reading the lob_data column (for some reason).  After all, we need to explain where all the LOB logical reads are coming from.  Sadly, when we look at the properties of the Clustered Index Scan, we see exactly the same as before: SQL Server is still only reading the pk and some_value columns – so what’s doing the LOB reads? Updates that Sneakily Read Data We have to go as far as the Clustered Index Update operator before we see LOB data in the output list: [Expr1020] is a bit flag added by an earlier Compute Scalar.  It is set true if the some_value column has not been changed (part of the non-updating updates optimization I mentioned earlier). The Clustered Index Update operator adds two new columns: the lob_data column, and some_value_OLD.  The some_value_OLD column, as the name suggests, is the pre-update value of the some_value column.  At this point, the clustered index has already been updated with the new value, but we haven’t touched the nonclustered index yet. An interesting observation here is that the Clustered Index Update operator can read a column into the data flow as part of its update operation.  SQL Server could have read the LOB data as part of the initial Clustered Index Scan, but that would mean carrying the data through all the operations that occur prior to the Clustered Index Update.  The server knows it will have to go back to the clustered index row to update it, so it delays reading the LOB data until then.  Sneaky! Why the LOB Data Is Needed This is all very interesting (I hope), but why is SQL Server reading the LOB data?  For that matter, why does it need to pass the pre-update value of the some_value column out of the Clustered Index Update? The answer relates to the top row of the query plan for test 3.  I’ll reproduce it here for convenience: Notice that this is a wide (per-index) update plan.  SQL Server used a narrow (per-row) update plan in test 2, where the Clustered Index Update took care of maintaining the nonclustered index too.  I’ll talk more about this difference shortly. The Split/Sort/Collapse combination is an optimization, which aims to make per-index update plans more efficient.  It does this by breaking each update into a delete/insert pair, reordering the operations, removing any redundant operations, and finally applying the net effect of all the changes to the nonclustered index. Imagine we had a unique index which currently holds three rows with the values 1, 2, and 3.  If we run a query that adds 1 to each row value, we would end up with values 2, 3, and 4.  The net effect of all the changes is the same as if we simply deleted the value 1, and added a new value 4. By applying net changes, SQL Server can also avoid false unique-key violations.  If we tried to immediately update the value 1 to a 2, it would conflict with the existing value 2 (which would soon be updated to 3 of course) and the query would fail.  You might argue that SQL Server could avoid the uniqueness violation by starting with the highest value (3) and working down.  That’s fine, but it’s not possible to generalize this logic to work with every possible update query. SQL Server has to use a wide update plan if it sees any risk of false uniqueness violations.  It’s worth noting that the logic SQL Server uses to detect whether these violations are possible has definite limits.  As a result, you will often receive a wide update plan, even when you can see that no violations are possible. Another benefit of this optimization is that it includes a sort on the index key as part of its work.  Processing the index changes in index key order promotes sequential I/O against the nonclustered index. A side-effect of all this is that the net changes might include one or more inserts.  In order to insert a new row in the index, SQL Server obviously needs all the columns – the key column and the included LOB column.  This is the reason SQL Server reads the LOB data as part of the Clustered Index Update. In addition, the some_value_OLD column is required by the Split operator (it turns updates into delete/insert pairs).  In order to generate the correct index key delete operation, it needs the old key value. The irony is that in this case the Split/Sort/Collapse optimization is anything but.  Reading all that LOB data is extremely expensive, so it is sad that the current version of SQL Server has no way to avoid it. Finally, for completeness, I should mention that the Filter operator is there to filter out the non-updating updates. Beating the Set-Based Update with a Cursor One situation where SQL Server can see that false unique-key violations aren’t possible is where it can guarantee that only one row is being updated.  Armed with this knowledge, we can write a cursor (or the WHILE-loop equivalent) that updates one row at a time, and so avoids reading the LOB data: SET NOCOUNT ON; SET STATISTICS XML, IO, TIME OFF;   DECLARE @PK INTEGER, @StartTime DATETIME; SET @StartTime = GETUTCDATE();   DECLARE curUpdate CURSOR LOCAL FORWARD_ONLY KEYSET SCROLL_LOCKS FOR SELECT L.pk FROM LOBtest L ORDER BY L.pk ASC;   OPEN curUpdate;   WHILE (1 = 1) BEGIN FETCH NEXT FROM curUpdate INTO @PK;   IF @@FETCH_STATUS = -1 BREAK; IF @@FETCH_STATUS = -2 CONTINUE;   UPDATE dbo.LOBtest SET some_value = some_value - 1 WHERE CURRENT OF curUpdate; END;   CLOSE curUpdate; DEALLOCATE curUpdate;   SELECT DATEDIFF(MILLISECOND, @StartTime, GETUTCDATE()); That completes the update in 1280 milliseconds (remember test 3 took over 16 seconds!) I used the WHERE CURRENT OF syntax there and a KEYSET cursor, just for the fun of it.  One could just as well use a WHERE clause that specified the primary key value instead. Clustered Indexes A clustered index is the ultimate index with included columns: all non-key columns are included columns in a clustered index.  Let’s re-create the test table and data with an updatable primary key, and without any non-clustered indexes: IF OBJECT_ID(N'dbo.LOBtest', N'U') IS NOT NULL DROP TABLE dbo.LOBtest; GO CREATE TABLE dbo.LOBtest ( pk INTEGER NOT NULL, some_value INTEGER NULL, lob_data VARCHAR(MAX) NULL, another_column CHAR(5) NULL, CONSTRAINT [PK dbo.LOBtest pk] PRIMARY KEY CLUSTERED (pk ASC) ); GO DECLARE @Data VARCHAR(MAX); SET @Data = REPLICATE(CONVERT(VARCHAR(MAX), 'x'), 65540);   WITH Numbers (n) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2 ) INSERT LOBtest WITH (TABLOCKX) ( pk, some_value, lob_data ) SELECT TOP (1000) N.n, N.n, @Data FROM Numbers N WHERE N.n <= 1000; Now here’s a query to modify the cluster keys: UPDATE dbo.LOBtest SET pk = pk + 1; The query plan is: As you can see, the Split/Sort/Collapse optimization is present, and we also gain an Eager Table Spool, for Halloween protection.  In addition, SQL Server now has no choice but to read the LOB data in the Clustered Index Scan: The performance is not great, as you might expect (even though there is no non-clustered index to maintain): Table 'LOBtest'. Scan count 1, logical reads 2011, physical reads 0, read-ahead reads 0, lob logical reads 36015, lob physical reads 0, lob read-ahead reads 15992.   Table 'Worktable'. Scan count 1, logical reads 2040, physical reads 0, read-ahead reads 0, lob logical reads 34000, lob physical reads 0, lob read-ahead reads 8000.   SQL Server Execution Times: CPU time = 483 ms, elapsed time = 17884 ms. Notice how the LOB data is read twice: once from the Clustered Index Scan, and again from the work table in tempdb used by the Eager Spool. If you try the same test with a non-unique clustered index (rather than a primary key), you’ll get a much more efficient plan that just passes the cluster key (including uniqueifier) around (no LOB data or other non-key columns): A unique non-clustered index (on a heap) works well too: Both those queries complete in a few tens of milliseconds, with no LOB reads, and just a few thousand logical reads.  (In fact the heap is rather more efficient). There are lots more fun combinations to try that I don’t have space for here. Final Thoughts The behaviour shown in this post is not limited to LOB data by any means.  If the conditions are met, any unique index that has included columns can produce similar behaviour – something to bear in mind when adding large INCLUDE columns to achieve covering queries, perhaps. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

    Read the article

  • Auto blocking attacking IP address

    - by dong
    This is to share my PowerShell code online. I original asked this question on MSDN forum (or TechNet?) here: http://social.technet.microsoft.com/Forums/en-US/winserversecurity/thread/f950686e-e3f8-4cf2-b8ec-2685c1ed7a77 In short, this is trying to find attacking IP address then add it into Firewall block rule. So I suppose: 1, You are running a Windows Server 2008 facing the Internet. 2, You need to have some port open for service, e.g. TCP 21 for FTP; TCP 3389 for Remote Desktop. You can see in my code I’m only dealing with these two since that’s what I opened. You can add further port number if you like, but the way to process might be different with these two. 3, I strongly suggest you use STRONG password and follow all security best practices, this ps1 code is NOT for adding security to your server, but reduce the nuisance from brute force attack, and make sys admin’s life easier: i.e. your FTP log won’t hold megabytes of nonsense, your Windows system log will not roll back and only can tell you what happened last month. 4, You are comfortable with setting up Windows Firewall rules, in my code, my rule has a name of “MY BLACKLIST”, you need to setup a similar one, and set it to BLOCK everything. 5, My rule is dangerous because it has the risk to block myself out as well. I do have a backup plan i.e. the DELL DRAC5 so that if that happens, I still can remote console to my server and reset the firewall. 6, By no means the code is perfect, the coding style, the use of PowerShell skills, the hard coded part, all can be improved, it’s just that it’s good enough for me already. It has been running on my server for more than 7 MONTHS. 7, Current code still has problem, I didn’t solve it yet, further on this point after the code. :)    #Dong Xie, March 2012  #my simple code to monitor attack and deal with it  #Windows Server 2008 Logon Type  #8: NetworkCleartext, i.e. FTP  #10: RemoteInteractive, i.e. RDP    $tick = 0;  "Start to run at: " + (get-date);    $regex1 = [regex] "192\.168\.100\.(?:101|102):3389\s+(\d+\.\d+\.\d+\.\d+)";  $regex2 = [regex] "Source Network Address:\t(\d+\.\d+\.\d+\.\d+)";    while($True) {   $blacklist = @();     "Running... (tick:" + $tick + ")"; $tick+=1;    #Port 3389  $a = @()  netstat -no | Select-String ":3389" | ? { $m = $regex1.Match($_); `    $ip = $m.Groups[1].Value; if ($m.Success -and $ip -ne "10.0.0.1") {$a = $a + $ip;} }  if ($a.count -gt 0) {    $ips = get-eventlog Security -Newest 1000 | Where-Object {$_.EventID -eq 4625 -and $_.Message -match "Logon Type:\s+10"} | foreach { `      $m = $regex2.Match($_.Message); $ip = $m.Groups[1].Value; $ip; } | Sort-Object | Tee-Object -Variable list | Get-Unique    foreach ($ip in $a) { if ($ips -contains $ip) {      if (-not ($blacklist -contains $ip)) {        $attack_count = ($list | Select-String $ip -SimpleMatch | Measure-Object).count;        "Found attacking IP on 3389: " + $ip + ", with count: " + $attack_count;        if ($attack_count -ge 20) {$blacklist = $blacklist + $ip;}      }      }    }  }      #FTP  $now = (Get-Date).AddMinutes(-5); #check only last 5 mins.     #Get-EventLog has built-in switch for EventID, Message, Time, etc. but using any of these it will be VERY slow.  $count = (Get-EventLog Security -Newest 1000 | Where-Object {$_.EventID -eq 4625 -and $_.Message -match "Logon Type:\s+8" -and `              $_.TimeGenerated.CompareTo($now) -gt 0} | Measure-Object).count;  if ($count -gt 50) #threshold  {     $ips = @();     $ips1 = dir "C:\inetpub\logs\LogFiles\FPTSVC2" | Sort-Object -Property LastWriteTime -Descending `       | select -First 1 | gc | select -Last 200 | where {$_ -match "An\+error\+occured\+during\+the\+authentication\+process."} `        | Select-String -Pattern "(\d+\.\d+\.\d+\.\d+)" | select -ExpandProperty Matches | select -ExpandProperty value | Group-Object `        | where {$_.Count -ge 10} | select -ExpandProperty Name;       $ips2 = dir "C:\inetpub\logs\LogFiles\FTPSVC3" | Sort-Object -Property LastWriteTime -Descending `       | select -First 1 | gc | select -Last 200 | where {$_ -match "An\+error\+occured\+during\+the\+authentication\+process."} `        | Select-String -Pattern "(\d+\.\d+\.\d+\.\d+)" | select -ExpandProperty Matches | select -ExpandProperty value | Group-Object `        | where {$_.Count -ge 10} | select -ExpandProperty Name;     $ips += $ips1; $ips += $ips2; $ips = $ips | where {$_ -ne "10.0.0.1"} | Sort-Object | Get-Unique;         foreach ($ip in $ips) {       if (-not ($blacklist -contains $ip)) {        "Found attacking IP on FTP: " + $ip;        $blacklist = $blacklist + $ip;       }     }  }        #Firewall change <# $current = (netsh advfirewall firewall show rule name="MY BLACKLIST" | where {$_ -match "RemoteIP"}).replace("RemoteIP:", "").replace(" ","").replace("/255.255.255.255",""); #inside $current there is no \r or \n need remove. foreach ($ip in $blacklist) { if (-not ($current -match $ip) -and -not ($ip -like "10.0.0.*")) {"Adding this IP into firewall blocklist: " + $ip; $c= 'netsh advfirewall firewall set rule name="MY BLACKLIST" new RemoteIP="{0},{1}"' -f $ip, $current; Invoke-Expression $c; } } #>    foreach ($ip in $blacklist) {    $fw=New-object –comObject HNetCfg.FwPolicy2; # http://blogs.technet.com/b/jamesone/archive/2009/02/18/how-to-manage-the-windows-firewall-settings-with-powershell.aspx    $myrule = $fw.Rules | where {$_.Name -eq "MY BLACKLIST"} | select -First 1; # Potential bug here?    if (-not ($myrule.RemoteAddresses -match $ip) -and -not ($ip -like "10.0.0.*"))      {"Adding this IP into firewall blocklist: " + $ip;         $myrule.RemoteAddresses+=(","+$ip);      }  }    Wait-Event -Timeout 30 #pause 30 secs    } # end of top while loop.   Further points: 1, I suppose the server is listening on port 3389 on server IP: 192.168.100.101 and 192.168.100.102, you need to replace that with your real IP. 2, I suppose you are Remote Desktop to this server from a workstation with IP: 10.0.0.1. Please replace as well. 3, The threshold for 3389 attack is 20, you don’t want to block yourself just because you typed your password wrong 3 times, you can change this threshold by your own reasoning. 4, FTP is checking the log for attack only to the last 5 mins, you can change that as well. 5, I suppose the server is serving FTP on both IP address and their LOG path are C:\inetpub\logs\LogFiles\FPTSVC2 and C:\inetpub\logs\LogFiles\FPTSVC3. Change accordingly. 6, FTP checking code is only asking for the last 200 lines of log, and the threshold is 10, change as you wish. 7, the code runs in a loop, you can set the loop time at the last line. To run this code, copy and paste to your editor, finish all the editing, get it to your server, and open an CMD window, then type powershell.exe –file your_powershell_file_name.ps1, it will start running, you can Ctrl-C to break it. This is what you see when it’s running: This is when it detected attack and adding the firewall rule: Regarding the design of the code: 1, There are many ways you can detect the attack, but to add an IP into a block rule is no small thing, you need to think hard before doing it, reason for that may include: You don’t want block yourself; and not blocking your customer/user, i.e. the good guy. 2, Thus for each service/port, I double check. For 3389, first it needs to show in netstat.exe, then the Event log; for FTP, first check the Event log, then the FTP log files. 3, At three places I need to make sure I’m not adding myself into the block rule. –ne with single IP, –like with subnet.   Now the final bit: 1, The code will stop working after a while (depends on how busy you are attacked, could be weeks, months, or days?!) It will throw Red error message in CMD, don’t Panic, it does no harm, but it also no longer blocking new attack. THE REASON is not confirmed with MS people: the COM object to manage firewall, you can only give it a list of IP addresses to the length of around 32KB I think, once it reaches the limit, you get the error message. 2, This is in fact my second solution to use the COM object, the first solution is still in the comment block for your reference, which is using netsh, that fails because being run from CMD, you can only throw it a list of IP to 8KB. 3, I haven’t worked the workaround yet, some ideas include: wrap that RemoteAddresses setting line with error checking and once it reaches the limit, use the newly detected IP to be the list, not appending to it. This basically reset your block rule to ground zero and lose the previous bad IPs. This does no harm as it sounds, because given a certain period has passed, any these bad IPs still not repent and continue the attack to you, it only got 30 seconds or 20 guesses of your password before you block it again. And there is the benefit that the bad IP may turn back to the good hands again, and you are not blocking a potential customer or your CEO’s home pc because once upon a time, it’s a zombie. Thus the ZEN of blocking: never block any IP for too long. 4, But if you insist to block the ugly forever, my other ideas include: You call MS support, ask them how can we set an arbitrary length of IP addresses in a rule; at least from my experiences at the Forum, they don’t know and they don’t care, because they think the dynamic blocking should be done by some expensive hardware. Or, from programming perspective, you can create a new rule once the old is full, then you’ll have MY BLACKLIST1, MY  BLACKLIST2, MY BLACKLIST3, … etc. Once in a while you can compile them together and start a business to sell your blacklist on the market! Enjoy the code! p.s. (PowerShell is REALLY REALLY GREAT!)

    Read the article

  • NSFetchedResultsController crashing on performFetch: when using a cache

    - by Oliver
    I make use of NSFetchedResultsController to display a bunch of objects, which are sectioned using dates. On a fresh install, it all works perfectly and the objects are displayed in the table view. However, it seems that when the app is relaunched I get a crash. I specify a cache when initialising the NSFetchedResultsController, and when I don't it works perfectly. Here is how I create my NSFetchedResultsController: - (NSFetchedResultsController *)results { // If we are not nil, stop here if (results != nil) return results; // Create the fetch request, entity and sort descriptors NSFetchRequest *fetch = [[NSFetchRequest alloc] init]; NSEntityDescription *entity = [NSEntityDescription entityForName:@"Event" inManagedObjectContext:self.managedObjectContext]; NSSortDescriptor *descriptor = [[NSSortDescriptor alloc] initWithKey:@"utc_start" ascending:YES]; NSArray *descriptors = [[NSArray alloc] initWithObjects:descriptor, nil]; // Set properties on the fetch [fetch setEntity:entity]; [fetch setSortDescriptors:descriptors]; // Create a fresh fetched results controller NSFetchedResultsController *fetched = [[NSFetchedResultsController alloc] initWithFetchRequest:fetch managedObjectContext:self.managedObjectContext sectionNameKeyPath:@"day" cacheName:@"Events"]; fetched.delegate = self; self.results = fetched; // Release objects and return our controller [fetched release]; [fetch release]; [descriptor release]; [descriptors release]; return results; } These are the messages I get when the app crashes: FATAL ERROR: The persistent cache of section information does not match the current configuration. You have illegally mutated the NSFetchedResultsController's fetch request, its predicate, or its sort descriptor without either disabling caching or using +deleteCacheWithName: *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'FATAL ERROR: The persistent cache of section information does not match the current configuration. You have illegally mutated the NSFetchedResultsController's fetch request, its predicate, or its sort descriptor without either disabling caching or using +deleteCacheWithName:' I really have no clue as to why it's saying that, as I don't believe I'm doing anything special that would cause this. The only potential issue is the section header (day), which I construct like this when creating a new object: // Set the new format [formatter setDateFormat:@"dd MMMM"]; // Set the day of the event [event setValue:[formatter stringFromDate:[event valueForKey:@"utc_start"]] forKey:@"day"]; Like I mentioned, all of this works fine if there is no cache involved. Any help appreciated!

    Read the article

  • Firefox Password script same PW auto login for 1000s devices Possible to use greasemonkey

    - by ritztech
    I currently work in a position that i have to manage and access 1000s of pages for troubleshooting and new setup.... and im trying to figure out a way for firefox or chrome to setup ANYtime it sees for instance a a web based page for equipment like (CISCO, Linksys, Sonicwall, T1 controllers) from the manufacture in the Title bar or from the originating page place a set up 2 - 3 passwords to auto log on with. 1st one of course being the most common so it logs in faster.. I access about 14 different web based products with passwords tied to each of them and if someway i can grab info stating that hey this company is cisco/sonicwall/linksys/hp/ log on with these set of 3-5 credintials. Using possible If then statements.... is that hard i saw some script files but not sure if its difficult because some apps use the MSG BOX built in feature and some use the form submit method built on the page unless i can have 2 different grease monkey scripts at the same time.... thanks.

    Read the article

  • Server-side DataTable Sorting in RichFaces

    - by sblundy
    I have a data table with a variable number of columns and a data scroller. How can I enable server side sorting? I prefer that it be fired by the user clicking the column header. <rich:datascroller for="instanceList" actionListener="#{pageDataModel.pageChange}"/> <rich:dataTable id="instanceList" rows="10" value="#{pageDataModel}" var="fieldValues" rowKeyVar="rowKey"> <rich:columns value="#{pageDataModel.columnNames}" var="column" index="idx"> <f:facet name="header"> <h:outputText value="#{column}"/> </f:facet> <h:outputText value="#{classFieldValues[idx]}" /> </rich:columns> </rich:dataTable> I already have a method on the bean for executing the sort. public void sort(int column)

    Read the article

  • Sorting deeply nested attributes in Rails

    - by Senthil
    I want to be able to drag and drag App model which is nested under Category model. http://railscasts.com/episodes/196-nested-model-form-part-1 's the Railscast I've tried to follow. Category controller def move params[:apps].each_with_index do |id, index| Category.last.apps.update(['position=?', index+1], ['id=?', Category.last.id]) end render :nothing => true end I'm able to sort Categories with something similar, but since I'm updating an attribute, I'm having trouble. def sort params[:categories].each_with_index do |id, index| Category.update_all(['position=?', index+1], ['id=?', id]) end render :nothing => true end Any help is appreciated.

    Read the article

  • Size Mismatch Between Swing Fonts and Printable Fonts in Java

    - by Caleb Rackliffe
    Hi All, I've got a panel displaying a JTextPane backed by a StyledDocument. When I print a string of text in, say Arial 16, the text it prints is of a different size than the Arial 16 Word prints. Is there some sort of flaw in the translation of Swing fonts to Windows system fonts or something of the sort that makes it difficult (or impossible) to print accurately? I can achieve an approximation by scaling down the size of my font before printing, but this never quite gets me the results I would like, as it's not possible in all cases to reproduce things like equivalent numbers of words on a line, etc. Has anybody run into this before?

    Read the article

  • Stack overflow in compojure web project

    - by Anders Rune Jensen
    Hi I've been playing around with clojure and have been using it to build a simple little audio player. The strange thing is that sometimes, maybe one out of twenty, when contacting the server I will get the following error: 2010-04-20 15:33:20.963::WARN: Error for /control java.lang.StackOverflowError at clojure.lang.RT.seq(RT.java:440) at clojure.core$seq__4245.invoke(core.clj:105) at clojure.core$filter__5084$fn__5086.invoke(core.clj:1794) at clojure.lang.LazySeq.sval(LazySeq.java:42) at clojure.lang.LazySeq.seq(LazySeq.java:56) at clojure.lang.RT.seq(RT.java:440) at clojure.core$seq__4245.invoke(core.clj:105) at clojure.core$filter__5084$fn__5086.invoke(core.clj:1794) at clojure.lang.LazySeq.sval(LazySeq.java:42) at clojure.lang.LazySeq.seq(LazySeq.java:56) at clojure.lang.RT.seq(RT.java:440) at clojure.core$seq__4245.invoke(core.clj:105) at clojure.core$filter__5084$fn__5086.invoke(core.clj:1794) at clojure.lang.LazySeq.sval(LazySeq.java:42) at clojure.lang.LazySeq.seq(LazySeq.java:56) at clojure.lang.RT.seq(RT.java:440) at clojure.core$seq__4245.invoke(core.clj:105) at clojure.core$filter__5084$fn__5086.invoke(core.clj:1794) at clojure.lang.LazySeq.sval(LazySeq.java:42) at clojure.lang.LazySeq.seq(LazySeq.java:56) at clojure.lang.RT.seq(RT.java:440) ... If I do it right after again it always works. So it appears to be related to timing or something. The code in question is: (defn add-track [t] (common/ref-add tracks t)) (defn add-collection [coll] (doseq [track coll] (add-track track))) and (defn ref-add [ref value] (dosync (ref-set ref (conj @ref value)))) where coll is extracted from this function: (defn tracks-by-album [album] (sort sort-tracks (filter #(= (:album %) album) @tracks))) which uses: (defn get-album-from-track [track] (seq/find-first #(= (:album track) (:name %)) @albums)) (defn sort-tracks [track1 track2] (cond (= (:album track1) (:album track2)) (cond (and (:album-track track1) (:album-track track2)) (< (:album-track track1) (:album-track track2)) :else 0) :else (> (:year (get-album-from-track track1)) (:year (get-album-from-track track2))))) it gets called more or less directly from the request I get in: (when-handle-command cmd params (audio/tracks-by-album decoded-name)) (defn when-handle-command [cmd params data] (println (str "handling command:" cmd)) ....) I never get the handling command in my log, so it must die when it does the tracks-by-album. so it does appear to be the tracks-by-album function from the stack trace. I just don't see why it sometimes works and sometimes doesn't. I say that it's tracks-by-album because it's the only function (including it's children) that does filter, as can be seen in the trace. All the source code is available at: http://code.google.com/p/mucomp/. It's my little hobby project to learn clojure and so far it's quite buggy (this is just one bug :)) so I havn't really liked to tell too many people about it yet :)

    Read the article

  • Suitable GUI for sorting rows at database level and/or WYSIWYG level?

    - by Kristoffer
    Consider an Explorer-like list view with a number of columns. The data is fetched from a database, and the rows can be sorted by clicking the column headers. When you click column A, you expect the fetched data to be sorted by A - at the database level ("ORDER BY" at the selected column). However, sometimes it is desirable to sort the data presented in the GUI - the visible data (WYSIWYG). How do you combine these two? E.g. How do you allow the user to sort both the fetched data and the data visible in the GUI? Have you seen a GUI that solves this elegantly?

    Read the article

  • Rack rSpec Controller Tests with Rack Middleware issue

    - by Roman Gonzalez
    Howdy, I'm having big trouble testing with rSpec's controller API. Right now I'm using a middleware authentication solution (Warden), and when I run the specs, the proxy added by the middleware is not there, and all the authentication tests are throwing NilPointerExceptions all over the place. It seems rSpec is not adding the middleware to the final app on purpose, and I would like to know if there is a way to monkey patch rSpec in order to make that go. I already tested the whole thing with cucumber, however this is a refactoring of an old authentication version and there is several Controller tests that depend on authentication logic in order to work. Thanks in advance.

    Read the article

  • multicolumn sorting of WinForms DataGridView

    - by Bi
    I have a DataGridView in a windows form with 3 columns: Serial number, Name, and Date-Time. The Name column will always have either of the two values: "name1" or "name2". I need to sort these columns such that the grid displays all the rows with name values in a specific order (first display all the "name1" rows and then all the "name2" rows). Within the "name1" rows, I want the rows to be sorted by the Date-Time. Please note programmatically, all the 3 columns are strings. For example, if I have the rows: 01 |Name1 | 2010-05-05 10:00 PM 02 |Name2 | 2010-05-02 08:00 AM 03 |Name2 | 2010-05-01 08:00 AM 04 |Name1 | 2010-05-01 11:00 AM 05 |Name1 | 2010-05-04 07:00 AM needs to be sorted as 04 |Name1 | 2010-05-01 11:00 AM 05 |Name1 | 2010-05-04 07:00 AM 01 |Name1 | 2010-05-05 10:00 PM 03 |Name2 | 2010-05-01 08:00 AM 02 |Name2 | 2010-05-02 08:00 AM I am not sure how to go about using the below: myGrid.Sort(.....,ListSortDirection.Ascending)

    Read the article

  • BSD Sockets don't behave in a iPhone 3G environment

    - by Kyle
    I noticed that many times while developing for an iPhone 3G, BSD socket functions will simply fail. I also noticed at the time, the 3G antenna wasn't even ON, nor was there WIFI Access to back up the network call (So it seems ridiculous that it doesn't turn on to support the network request).. This information was verified with an app from Apple in the SDK called Connectivity Test, or something of the sort. Basically if you load Safari or something, then quickly load up the App it would be fine.. Of course that's not ideal. Apparently, to apple, gethostbyname() or something of the sort is by no means a reason to turn on the Antenna. I contacted Apple about this, and they said that the BSD functions do not switch the Antenna on, but calling all of the Objective-C CFNetwork functions do. I want portable code, so is there a way to keep my existing BSD setup? I really dislike coding in Objective-C, so if anyone knows a work around, that would be awesome.

    Read the article

  • Query only the first detail record for each master record

    - by Neal S.
    If I have the following master-detail relationship: owner_tbl auto_tbl --------- -------- owner --- owner auto year And I have the following table data: owner_tbl auto_tbl --------- -------- john john, corvette, 1968 john, prius, 2008 james james, f-150, 2004 james, cadillac, 2002 james, accord, 2009 jeff jeff, tesla, 2010 jeff, hyundai, 1996 Now, I want to perform a query that returns the following result: john, corvette, 1968 jeff, hyundai, 1996 james, cadillac, 2002 The query should join the two tables, and sort all the records on the "year" field, but only return the first detail record for each master record. I know how to join the tables and sort on the "year" field, but it's not clear how (or if) I might be able to only retrieve the first joined record for each owner. Three related questions: Can I perform this kind of query using LINQ-to-SQL? Can I perform the query using T-SQL? Would it be best to just create a stored procedure for the query given its likely complexity?

    Read the article

  • Could somebody give me a high-level technical overview of WSGI details behind the scenes vs other we

    - by orokusaki
    Firstly: I understand what WSGI is and how to use it I understand what "other" methods (Apache mod-python, fcgi, et al) are, and how to use them I understand their practical differences What I don't understand is how each of the various "other" methods work compared to something like UWSGI, behind the scenes. Does your server (Nginx, etc) route the request to your WSGI application and UWSGI creates a new Python interpreter for each request routed to it? How much different is is from the other more traditional / monkey patched methods is WSGI (aside from the different, easier Python interface that WSGI offers)? What light bulb moment am I missing?

    Read the article

  • Multi-column sorting with an NSTableView bound to an NSArrayController

    - by Cinder6
    Hi there. I have an NSTableView that's bound to an NSArrayController. Everything gets loaded into the table correctly, and I have sort keys and selectors set for the various columns (this also works). What I'm trying to do is sort multiple columns when the user clicks one column header, but I can't find any way to do this. I could brute-force a way by overriding tableView:mouseDownInHeaderOfTableColumn:, but that seems an inelegant way of doing it. Is there a good way of doing multi-column sorting in an NSTableView using bindings?

    Read the article

  • SSL on Heroku / User Authentication Across Multiple Domains

    - by Euwyn
    Posted a previous question on this, but have a followup. I was trying to create a workaround to use SSL on the expensive custom domain. I'm willing to live with bumping a user to https://app.heroku.com from http://www.app.com for certain secure pages, and have monkey-patched SSL required to make this happen. However, now this issue is with making sure my User is logged in when I do so. As I understand, cookies aren't cross domain. Is there a way around this issue?

    Read the article

< Previous Page | 64 65 66 67 68 69 70 71 72 73 74 75  | Next Page >