Search Results

Search found 186 results on 8 pages for 'yaakov davis'.

Page 8/8 | < Previous Page | 4 5 6 7 8

Your thoughts on Best Practices for Scientific Computing?

- by John Smith

A recent paper by Wilson et al (2014) pointed out 24 Best Practices for scientific programming. It's worth to have a look. I would like to hear opinions about these points from experienced programmers in scientific data analysis. Do you think these advices are helpful and practical? Or are they good only in an ideal world? Wilson G, Aruliah DA, Brown CT, Chue Hong NP, Davis M, Guy RT, Haddock SHD, Huff KD, Mitchell IM, Plumbley MD, Waugh B, White EP, Wilson P (2014) Best Practices for Scientific Computing. PLoS Biol 12:e1001745. http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001745 Box 1. Summary of Best Practices Write programs for people, not computers. (a) A program should not require its readers to hold more than a handful of facts in memory at once. (b) Make names consistent, distinctive, and meaningful. (c) Make code style and formatting consistent. Let the computer do the work. (a) Make the computer repeat tasks. (b) Save recent commands in a file for re-use. (c) Use a build tool to automate workflows. Make incremental changes. (a) Work in small steps with frequent feedback and course correction. (b) Use a version control system. (c) Put everything that has been created manually in version control. Don’t repeat yourself (or others). (a) Every piece of data must have a single authoritative representation in the system. (b) Modularize code rather than copying and pasting. (c) Re-use code instead of rewriting it. Plan for mistakes. (a) Add assertions to programs to check their operation. (b) Use an off-the-shelf unit testing library. (c) Turn bugs into test cases. (d) Use a symbolic debugger. Optimize software only after it works correctly. (a) Use a profiler to identify bottlenecks. (b) Write code in the highest-level language possible. Document design and purpose, not mechanics. (a) Document interfaces and reasons, not implementations. (b) Refactor code in preference to explaining how it works. (c) Embed the documentation for a piece of software in that software. Collaborate. (a) Use pre-merge code reviews. (b) Use pair programming when bringing someone new up to speed and when tackling particularly tricky problems. (c) Use an issue tracking tool. I'm relatively new to serious programming for scientific data analysis. When I tried to write code for pilot analyses of some of my data last year, I encountered tremendous amount of bugs both in my code and data. Bugs and errors had been around me all the time, but this time it was somewhat overwhelming. I managed to crunch the numbers at last, but I thought I couldn't put up with this mess any longer. Some actions must be taken. Without a sophisticated guide like the article above, I started to adopt "defensive style" of programming since then. A book titled "The Art of Readable Code" helped me a lot. I deployed meticulous input validations or assertions for every function, renamed a lot of variables and functions for better readability, and extracted many subroutines as reusable functions. Recently, I introduced Git and SourceTree for version control. At the moment, because my co-workers are much more reluctant about these issues, the collaboration practices (8a,b,c) have not been introduced. Actually, as the authors admitted, because all of these practices take some amount of time and effort to introduce, it may be generally hard to persuade your reluctant collaborators to comply them. I think I'm asking your opinions because I still suffer from many bugs despite all my effort on many of these practices. Bug fix may be, or should be, faster than before, but I couldn't really measure the improvement. Moreover, much of my time has been invested on defence, meaning that I haven't actually done much data analysis (offence) these days. Where is the point I should stop at in terms of productivity? I've already deployed: 1a,b,c, 2a, 3a,b,c, 4b,c, 5a,d, 6a,b, 7a,7b I'm about to have a go at: 5b,c Not yet: 2b,c, 4a, 7c, 8a,b,c (I could not really see the advantage of using GNU make (2c) for my purpose. Could anyone tell me how it helps my work with MATLAB?)

Read the article
RSS Feeds currently on Simple-Talk

- by Andrew Clarke

There are a number of news-feeds for the Simple-Talk site, but for some reason they are well hidden. Whilst we set about reorganizing them, I thought it would be a good idea to list some of the more important ones. The most important one for almost all purposes is the Homepage RSS feed which represents the blogs and articles that are placed on the homepage. Main Site Feed representing the Homepage ..which is good for most purposes but won't always have all the blogs, or maybe it will occasionally miss an article. If you aren't interested in all the content, you can just use the RSS feeds that are more relevant to your interests. (We'll be increasing these categories soon) The newsfeed for SQL articles The .NET section newsfeed The newsfeed for Red Gate books The newsfeed for Opinion articles The SysAdmin section newsfeed if you want to get a more refined feed, then you can pick and choose from these feeds for each category so as to make up your custom news-feed in the SQL section, SQL Training Learn SQL Server Database Administration TSQL Programming SQL Server Performance Backup and Recovery SQL Tools SSIS SSRS (Reporting Services) in .NET there are... ASP.NET Windows Forms .NET Framework ,NET Performance Visual Studio .NET tools in Sysadmin there are Exchange General Virtualisation Unified Messaging Powershell in opinion, there is... Geek of the Week Opinion Pieces in Books, there is .NET Books SQL Books SysAdmin Books And all the blogs have got feeds. So although you can get all the blogs from here.. Main Blog Feed You can get individual RSS feeds.. AdamRG's Blog Alex.Davies's Blog AliceE's Blog Andrew Clarke's Blog Andrew Hunter's Blog Bart Read's Blog Ben Adderson's Blog BobCram's Blog bradmcgehee's Blog Brian Donahue's Blog Charles Brown's Blog Chris Massey's Blog CliveT's Blog Damon's Blog David Atkinson's Blog David Connell's Blog Dr Dionysus's Blog drsql's Blog FatherJack's Blog Flibble's Blog Gareth Marlow's Blog Helen Joyce's Blog James's Blog Jason Crease's Blog John Magnabosco's Blog Laila's Blog Lionel's Blog Matt Lee's Blog mikef's Blog Neil Davidson's Blog Nigel Morse's Blog Phil Factor's Blog red@work's Blog reka.burmeister's Blog Richard Mitchell's Blog RobbieT's Blog RobertChipperfield's Blog Rodney's Blog Roger Hart's Blog Simon Cooper's Blog Simon Galbraith's Blog TheFutureOfMonitoring's Blog Tim Ford's Blog Tom Crossman's Blog Tony Davis's Blog As well as these blogs, you also have the forums.... SQL Server for Beginners Forum Programming SQL Server Forum Administering SQL Server Forum .NET framework Forum .Windows Forms Forum ASP.NET Forum ADO.NET Forum

Read the article
SQL SERVER – PAGELATCH_DT, PAGELATCH_EX, PAGELATCH_KP, PAGELATCH_SH, PAGELATCH_UP – Wait Type – Day 12 of 28

- by pinaldave

This is another common wait type. However, I still frequently see people getting confused with PAGEIOLATCH_X and PAGELATCH_X wait types. Actually, there is a big difference between the two. PAGEIOLATCH is related to IO issues, while PAGELATCH is not related to IO issues but is oftentimes linked to a buffer issue. Before we delve deeper in this interesting topic, first let us understand what Latch is. Latches are internal SQL Server locks which can be described as very lightweight and short-term synchronization objects. Latches are not primarily to protect pages being read from disk into memory. It’s a synchronization object for any in-memory access to any portion of a log or data file.[Updated based on comment of Paul Randal] The difference between locks and latches is that locks seal all the involved resources throughout the duration of the transactions (and other processes will have no access to the object), whereas latches locks the resources during the time when the data is changed. This way, a latch is able to maintain the integrity of the data between storage engine and data cache. A latch is a short-living lock that is put on resources on buffer cache and in the physical disk when data is moved in either directions. As soon as the data is moved, the latch is released. Now, let us understand the wait stat type related to latches. From Book On-Line: PAGELATCH_DT Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Destroy mode. PAGELATCH_EX Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Exclusive mode. PAGELATCH_KP Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Keep mode. PAGELATCH_SH Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Shared mode. PAGELATCH_UP Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Update mode. PAGELATCH_X Explanation: When there is a contention of access of the in-memory pages, this wait type shows up. It is quite possible that some of the pages in the memory are of very high demand. For the SQL Server to access them and put a latch on the pages, it will have to wait. This wait type is usually created at the same time. Additionally, it is commonly visible when the TempDB has higher contention as well. If there are indexes that are heavily used, contention can be created as well, leading to this wait type. Reducing PAGELATCH_X wait: The following counters are useful to understand the status of the PAGELATCH: Average Latch Wait Time (ms): The wait time for latch requests that have to wait. Latch Waits/sec: This is the number of latch requests that could not be granted immediately. Total Latch Wait Time (ms): This is the total latch wait time for latch requests in the last second. If there is TempDB contention, I suggest that you read the blog post of Robert Davis right away. He has written an excellent blog post regarding how to find out TempDB contention. The same blog post explains the terms in the allocation of GAM, SGAM and PFS. If there was a TempDB contention, Paul Randal explains the optimal settings for the TempDB in his misconceptions series. Trace Flag 1118 can be useful but use it very carefully. I totally understand that this blog post is not as clear as my other blog posts. I suggest if this wait stats is on one of your higher wait type. Do leave a comment or send me an email and I will get back to you with my solution for your situation. May the looking at all other wait stats and types together become effective as this wait type can help suggest proper bottleneck in your system. Read all the post in the Wait Types and Queue series. Note: The information presented here is from my experience and there is no way that I claim it to be accurate. I suggest reading Book OnLine for further clarification. All the discussions of Wait Stats in this blog are generic and vary from system to system. It is recommended that you test this on a development server before implementing it to a production server. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

Read the article
Scripting out Contained Database Users

- by Argenis

Today’s blog post comes from a Twitter thread on which @SQLSoldier, @sqlstudent144 and @SQLTaiob were discussing the internals of contained database users. Unless you have been living under a rock, you’ve heard about the concept of contained users within a SQL Server database (hit the link if you have not). In this article I’d like to show you that you can, indeed, script out contained database users and recreate them on another database, as either contained users or as good old fashioned logins/server principals as well. Why would this be useful? Well, because you would not need to know the password for the user in order to recreate it on another instance. I know there is a limited number of scenarios where this would be necessary, but nonetheless I figured I’d throw this blog post to show how it can be done. A more obscure use case: with the password hash (which I’m about to show you how to obtain) you could also crack the password using a utility like hashcat, as highlighted on this SQLServerCentral article. The Investigation SQL Server uses System Base Tables to save the password hashes of logins and contained database users. For logins it uses sys.sysxlgns, whereas for contained database users it leverages sys.sysowners. I’ll show you what I do to figure this stuff out: I create a login/contained user, and then I immediately browse the transaction log with, for example, fn_dblog. It’s pretty obvious that only two base tables touched by the operation are sys.sysxlgns, and also sys.sysprivs – the latter is used to track permissions. If I connect to the DAC on my instance, I can query for the password hash of this login I’ve just created. A few interesting things about this hash. This was taken on my laptop, and I happen to be running SQL Server 2014 RTM CU2, which is the latest public build of SQL Server 2014 as of time of writing. In 2008 R2 and prior versions (back to 2000), the password hashes would start with 0x0100. The reason why this changed is because starting with SQL Server 2012 password hashes are kept using a SHA512 algorithm, as opposed to SHA-1 (used since 2000) or Snefru (used in 6.5 and 7.0). SHA-1 is nowadays deemed unsafe and is very easy to crack. For regular SQL logins, this information is exposed through the sys.sql_logins catalog view, so there is really no need to connect to the DAC to grab an SID/password hash pair. For contained database users, there is (currently) no method of obtaining SID or password hashes without connecting to the DAC. If we create a contained database user, this is what we get from the transaction log: Note that the System Base Table used in this case is sys.sysowners. sys.sysprivs is used as well, and again this is to track permissions. To query sys.sysowners, you would have to connect to the DAC, as I mentioned previously. And this is what you would get: There are other ways to figure out what SQL Server uses under the hood to store contained database user password hashes, like looking at the execution plan for a query to sys.dm_db_uncontained_entities (Thanks, Robert Davis!) SIDs, Logins, Contained Users, and Why You Care…Or Not. One of the reasons behind the existence of Contained Users was the concept of portability of databases: it is really painful to maintain Server Principals (Logins) synced across most shared-nothing SQL Server HA/DR technologies (Mirroring, Availability Groups, and Log Shipping). Often times you would need the Security Identifier (SID) of these logins to match across instances, and that meant that you had to fetch whatever SID was assigned to the login on the principal instance so you could recreate it on a secondary. With contained users you normally wouldn’t care about SIDs, as the users are always available (and synced, as long as synchronization takes place) across instances. Now you might be presented some particular requirement that might specify that SIDs synced between logins on certain instances and contained database users on other databases. How would you go about creating a contained database user with a specific SID? The answer is that you can’t do it directly, but there’s a little trick that would allow you to do it. Create a login with a specified SID and password hash, create a user for that server principal on a partially contained database, then migrate that user to contained using the system stored procedure sp_user_migrate_to_contained, then drop the login. CREATE LOGIN <login_name> WITH PASSWORD = <password_hash> HASHED, SID = <sid> ; GO USE <partially_contained_db>; GO CREATE USER <user_name> FROM LOGIN <login_name>; GO EXEC sp_migrate_user_to_contained @username = <user_name>, @rename = N’keep_name’, @disablelogin = N‘disable_login’; GO DROP LOGIN <login_name>; GO Here’s how this skeleton would look like in action: And now I have a contained user with a specified SID and password hash. In my example above, I renamed the user after migrated it to contained so that it is, hopefully, easier to understand. Enjoy!

Read the article
Linq to List and IEnumerable issues

- by Otaku

I am querying an HTML file with Linq. It looks something like this: <html> <body> <div class="Players"> <div class="role">Goalies</div> <div class="name">John Smith</div> <div class="name">Shawn Xie</div> <div class="role">Right Wings</div> <div class="name">Jack Davis</div> <div class="name">Carl Yuns</div> <div class="name">Wayne Gortonia</div> <div class="role">Centers</div> <div class="name">Lutz Gaspy</div> <div class="name">John Jacobs</div> </div </html> </body> What I'm trying to do is create a list of these folks like in a list of a structure called Players: Structure Players Public Name As String Public Position As String End Structure But I've quickly found out I don't really know what I'm doing when it comes to Linq. I've got this far my my queries: Dim goalieList = From d In player.Elements _ Where d.Value = "Goalies" _ Select From g In d.ElementsAfterSelf _ Take While (g.@class <> "role") _ Select New Players With {.Position = "Goalie", _ .Name = g.Value} Dim centersList = From d In player.Elements _ Where d.Value = "Centers" _ Select From g In d.ElementsAfterSelf _ Take While (g.@class <> "role") _ Select New Players With {.Position = "Centers", _ .Name = g.Value} Which gets me down to the the players by position, but then I can't do much with this afterwards the result type is System.Collections.Generic.IEnumerable(Of System.Collections.Generic.IEnumerable(Of Player)) What I want to do is add these two results to a new list, like: Dim playersList As List(Of Players) = Nothing playersList.AddRange(centersList) playersList.AddRange(goalieList) So that I can then query the list and use it. But it kicks the error: Unable to cast object of type 'WhereSelectEnumerableIterator2[System.Xml.Linq.XElement,System.Collections.Generic.IEnumerable1[Players]]' to type 'System.Collections.Generic.IEnumerable`1[Players]' As you can see, I may really have no idea how to work with all these objects/classes. Does anyone have any insight on what I may be doing wrong and how I can resolve it? RESOLVED: The Linq query needs to return a single iEnumerable, like this: Dim goalieList = From l In _ (From d In players.Elements _ Where d.Value = "Goalies" _ Select d.ElementsAfterSelf.TakeWhile(Function(f) f.@class <> "role")) _ Select New Players With {.Position = "Goalie", .Name = l.Value} and then use goalieList.ToList

Read the article
Listing common SQL Code Smells.

- by Phil Factor

Once you’ve done a number of SQL Code-reviews, you’ll know those signs in the code that all might not be well. These ’Code Smells’ are coding styles that don’t directly cause a bug, but are indicators that all is not well with the code. . Kent Beck and Massimo Arnoldi seem to have coined the phrase in the "OnceAndOnlyOnce" page of www.C2.com, where Kent also said that code "wants to be simple". Bad Smells in Code was an essay by Kent Beck and Martin Fowler, published as Chapter 3 of the book ‘Refactoring: Improving the Design of Existing Code’ (ISBN 978-0201485677) Although there are generic code-smells, SQL has its own particular coding habits that will alert the programmer to the need to re-factor what has been written. See Exploring Smelly Code and Code Deodorants for Code Smells by Nick Harrison for a grounding in Code Smells in C# I’ve always been tempted by the idea of automating a preliminary code-review for SQL. It would be so useful to trawl through code and pick up the various problems, much like the classic ‘Lint’ did for C, and how the Code Metrics plug-in for .NET Reflector by Jonathan 'Peli' de Halleux is used for finding Code Smells in .NET code. The problem is that few of the standard procedural code smells are relevant to SQL, and we need an agreed list of code smells. Merrilll Aldrich made a grand start last year in his blog Top 10 T-SQL Code Smells.However, I'd like to make a start by discovering if there is a general opinion amongst Database developers what the most important SQL Smells are. One can be a bit defensive about code smells. I will cheerfully write very long stored procedures, even though they are frowned on. I’ll use dynamic SQL occasionally. You can only use them as an aid for your own judgment and it is fine to ‘sign them off’ as being appropriate in particular circumstances. Also, whole classes of ‘code smells’ may be irrelevant for a particular database. The use of proprietary SQL, for example, is only a ‘code smell’ if there is a chance that the database will have to be ported to another RDBMS. The use of dynamic SQL is a risk only with certain security models. As the saying goes, a CodeSmell is a hint of possible bad practice to a pragmatist, but a sure sign of bad practice to a purist. Plamen Ratchev’s wonderful article Ten Common SQL Programming Mistakes lists some of these ‘code smells’ along with out-and-out mistakes, but there are more. The use of nested transactions, for example, isn’t entirely incorrect, even though the database engine ignores all but the outermost: but it does flag up the possibility that the programmer thinks that nested transactions are supported. If anything requires some sort of general agreement, the definition of code smells is one. I’m therefore going to make this Blog ‘dynamic, in that, if anyone twitters a suggestion with a #SQLCodeSmells tag (or sends me a twitter) I’ll update the list here. If you add a comment to the blog with a suggestion of what should be added or removed, I’ll do my best to oblige. In other words, I’ll try to keep this blog up to date. The name against each 'smell' is the name of the person who Twittered me, commented about or who has written about the 'smell'. it does not imply that they were the first ever to think of the smell! Use of deprecated syntax such as *= (Dave Howard) Denormalisation that requires the shredding of the contents of columns. (Merrill Aldrich) Contrived interfaces Use of deprecated datatypes such as TEXT/NTEXT (Dave Howard) Datatype mis-matches in predicates that rely on implicit conversion.(Plamen Ratchev) Using Correlated subqueries instead of a join (Dave_Levy/ Plamen Ratchev) The use of Hints in queries, especially NOLOCK (Dave Howard /Mike Reigler) Few or No comments. Use of functions in a WHERE clause. (Anil Das) Overuse of scalar UDFs (Dave Howard, Plamen Ratchev) Excessive ‘overloading’ of routines. The use of Exec xp_cmdShell (Merrill Aldrich) Excessive use of brackets. (Dave Levy) Lack of the use of a semicolon to terminate statements Use of non-SARGable functions on indexed columns in predicates (Plamen Ratchev) Duplicated code, or strikingly similar code. Misuse of SELECT * (Plamen Ratchev) Overuse of Cursors (Everyone. Special mention to Dave Levy & Adrian Hills) Overuse of CLR routines when not necessary (Sam Stange) Same column name in different tables with different datatypes. (Ian Stirk) Use of ‘broken’ functions such as ‘ISNUMERIC’ without additional checks. Excessive use of the WHILE loop (Merrill Aldrich) INSERT ... EXEC (Merrill Aldrich) The use of stored procedures where a view is sufficient (Merrill Aldrich) Not using two-part object names (Merrill Aldrich) Using INSERT INTO without specifying the columns and their order (Merrill Aldrich) Full outer joins even when they are not needed. (Plamen Ratchev) Huge stored procedures (hundreds/thousands of lines). Stored procedures that can produce different columns, or order of columns in their results, depending on the inputs. Code that is never used. Complex and nested conditionals WHILE (not done) loops without an error exit. Variable name same as the Datatype Vague identifiers. Storing complex data or list in a character map, bitmap or XML field User procedures with sp_ prefix (Aaron Bertrand)Views that reference views that reference views that reference views (Aaron Bertrand) Inappropriate use of sql_variant (Neil Hambly) Errors with identity scope using SCOPE_IDENTITY @@IDENTITY or IDENT_CURRENT (Neil Hambly, Aaron Bertrand) Schemas that involve multiple dated copies of the same table instead of partitions (Matt Whitfield-Atlantis UK) Scalar UDFs that do data lookups (poor man's join) (Matt Whitfield-Atlantis UK) Code that allows SQL Injection (Mladen Prajdic) Tables without clustered indexes (Matt Whitfield-Atlantis UK) Use of "SELECT DISTINCT" to mask a join problem (Nick Harrison) Multiple stored procedures with nearly identical implementation. (Nick Harrison) Excessive column aliasing may point to a problem or it could be a mapping implementation. (Nick Harrison) Joining "too many" tables in a query. (Nick Harrison) Stored procedure returning more than one record set. (Nick Harrison) A NOT LIKE condition (Nick Harrison) excessive "OR" conditions. (Nick Harrison) User procedures with sp_ prefix (Aaron Bertrand) Views that reference views that reference views that reference views (Aaron Bertrand) sp_OACreate or anything related to it (Bill Fellows) Prefixing names with tbl_, vw_, fn_, and usp_ ('tibbling') (Jeremiah Peschka) Aliases that go a,b,c,d,e... (Dave Levy/Diane McNurlan) Overweight Queries (e.g. 4 inner joins, 8 left joins, 4 derived tables, 10 subqueries, 8 clustered GUIDs, 2 UDFs, 6 case statements = 1 query) (Robert L Davis) Order by 3,2 (Dave Levy) MultiStatement Table functions which are then filtered 'Sel * from Udf() where Udf.Col = Something' (Dave Ballantyne) running a SQL 2008 system in SQL 2000 compatibility mode(John Stafford)

Read the article
Open source CMS for a university department

- by Greg Kuperberg

I realize that this type of question gets asked over and over again. Nonetheless, I want to ask a more specific version. I'm in a university math department. Long ago our sysadmins (or just one at the time) switched to a web content management system. At the time, Zope looked like an informed choice. We have used Zope for years, but at least in my opinion, it has always been a controversial decision. At the time I didn't understand why it was so important to have a web CMS. Now I see that it certainly is important, but I don't know that it should be Zope. The good (even necessary) features of Zope for us are: It's free and Linux-based. It is a true CMS and not something else (e.g. wiki or blog) It lets you write HTML and scripts. What I really don't like about Zope is that the outcome of using it is all-or-nothing in a lot of ways. At least in convenient use, it ends up dividing the enterprise into superusers who can do everything, and lusers who can't do anything (except write their own home pages in plain HTML). It has a huge user manual, which end users won't have time to read. Somehow with the access permissions, the simple thing to do is to let a few admins access all of the source and data and that's it. Since this is a math department, the user base varies from real novices to people who understand computers reasonably well. But as it stands, any change that involves Zope has to go through the sysadmins. When the sysadmins are in a hurry, sometimes they will also just add plain HTML pages to the web site instead of using the Zope framework. It doesn't help matters that Zope is fairly disk-intensive and fairly hype-intensive. Not to dwell on Zope too much, but I am wondering what is the right web CMS for a mixed user base of terminal novices, quick studies, and experienced users. Some users might want intermediate permissions, e.g. read permission but not write permission, or permission to change some subset of the pages or see some subset of the database tables. Also it should be Linux-based and open source and a little bit scalable, and of course widely used and well-supported is a good idea. I might guess that the answer is Drupal just because that was the general answer before, but I don't know if it is the right type of CMS for this purpose. (But note that Python is a relatively popular language in a math department, among other reasons because Sage is based on Python.) I can see that I didn't completely define the question and that people are guessing what type of site it is. It is the UC Davis Math Department. The main structure of the site is not suitable for a wiki and it is also not the same thing as a course environment like Moodle. Rather, the site is mostly structured as a generic medium-small enterprise. Some components of the site could be a wiki, Moodle, LaTeX plugin, Request Tracker, etc. However, the main issue is not these components. The main issue is that it would be better to decentralize management of the site. Right now, everything that is in the Zope CMS has to go through the sysadmins. Every other user in the department either has to put in a request to them, or write their own web pages with no help from Zope. There are two main reasons for this: (1) Other people in the department don't have time to read the Zope manual. (2) It's a hassle to set up intermediate permissions in Zope. However, there are other people in the department who know how to write computer programs and use markup languages. I wouldn't want a solution that assumes that users either can't be trusted with much more than drag-and-drop, or that they are IT professionals who sleep with documentation manuals. I'm wondering if Plone/Zope still has this quality, since certainly Zope by itself does. But I also wonder sometimes if common-sense flexibility is unfashionable these days, and that things in general have be either mindlessly easy or incredibly powerful.

Read the article
GIT: clone works, remote push doesn't. Remote repository over copssh.

- by Rui

Hi all, I've "setup-a-msysgit-server-with-copssh-on-windows", following Tim Davis' guide and I was now learning how to use the git commands, following Jason Meridth's guide, and I have managed to get everything working fine, but now I can't pass the push command. I have set the server and the client on the same machine (for now), win7-x64. Here is some info of how things are set up: CopSSH Folder : C:/SSH/ Local Home Folder : C:/Users/rvc/ Remote Home Folder: C:/SSH/home/rvc/ # aka /cygdrive/c/SSH/home/rvc/ git remote rep : C:/SSH/home/rvc/myapp.git # empty rep At '/SSH/home/rvc/.bashrc' and 'Users/rvc/.bashrc': export HOME=/cygdrive/c/SSH/home/rvc gitpath='/cygdrive/c/Program Files (x86)/Git/bin' gitcorepath='/cygdrive/c/Program Files (x86)/Git/libexec/git-core' PATH=${gitpath}:${gitcorepath}:${PATH} So, cloning works (everything bellow is done via "Git Bash here" :P): rvc@RVC-DESKTOP /c/code $ git clone ssh://[email protected]:5858/SSH/home/rvc/myapp.git Initialized empty Git repository in C:/code/myapp/.git/ warning: You appear to have cloned an empty repository. rvc@RVC-DESKTOP /c/code $ cd myapp rvc@RVC-DESKTOP /c/code/myapp (master) $ git remote -v origin ssh://[email protected]:5858/SSH/home/rvc/myapp.git (fetch) origin ssh://[email protected]:5858/SSH/home/rvc/myapp.git (push) Then I create a file: rvc@RVC-DESKTOP /c/code/myapp (master) $ touch test.file rvc@RVC-DESKTOP /c/code/myapp (master) $ ls test.file Try to push it and get this error: rvc@RVC-DESKTOP /c/code/myapp (master) $ git add test.file rvc@RVC-DESKTOP /c/code/myapp (master) $ GIT_TRACE=1 git push origin master trace: built-in: git 'push' 'origin' 'master' trace: run_command: 'C:\Users\rvc\bin\plink.exe' '-batch' '-P' '5858' '[email protected] 68.1.65' 'git-receive-pack '\''/SSH/home/rvc/myapp.git'\''' git: '/SSH/home/rvc/myapp.git' is not a git command. See 'git --help'. fatal: The remote end hung up unexpectedly "git: '/SSH/home/rvc/myapp.git' is not a git command. See 'git --help'." .. what?! So, can anybody help? If you need any aditional information that I've missed, just ask. Thanks in advanced. EDIT: RAAAGE!! I'm having the same problem again, but now with ssh: rvc@RVC-DESKTOP /c/code/myapp (master) $ GIT_TRACE=1 git push trace: built-in: git 'push' trace: run_command: 'ssh' '-p' '5885' '[email protected]' 'git-receive-pack '\''/ SSH/home/rvc/myapp.git'\''' git: '/SSH/home/rvc/myapp.git' is not a git command. See 'git --help'. fatal: The remote end hung up unexpectedly I've tried GUI push, and shows the same message. git: '/SSH/home/rvc/myapp.git' is not a git command. See 'git --help'. Pushing to ssh://[email protected]:5885/SSH/home/rvc/myapp.git fatal: The remote end hung up unexpectedly Here's the currents .bashrc: C:\Users\rvc.bashrc (I think this is used only by cygwin/git bash): export HOME=/c/SSH/home/rvc gitpath='/c/Program Files (x86)/Git/bin' gitcorepath='/c/Program Files (x86)/Git/libexec/git-core' export GIT_EXEC_PATH=${gitcorepath} PATH=${gitpath}:${gitcorepath}:${PATH} C:\SSH\home\rvc.bashrc (.. and this is used when git connects via ssh to the "remote" server): export HOME=/c/SSH/home/rvc gitpath='/cygdrive/c/Program Files (x86)/Git/bin' gitcorepath='/cygdrive/c/Program Files (x86)/Git/libexec/git-core' export GIT_EXEC_PATH=${gitcorepath} PATH=${gitpath}:${gitcorepath}:${PATH}

Read the article
Simple MSBuild Configuration: Updating Assemblies With A Version Number

- by srkirkland

When distributing a library you often run up against versioning problems, once facet of which is simply determining which version of that library your client is running. Of course, each project in your solution has an AssemblyInfo.cs file which provides, among other things, the ability to set the Assembly name and version number. Unfortunately, setting the assembly version here would require not only changing the version manually for each build (depending on your schedule), but keeping it in sync across all projects. There are many ways to solve this versioning problem, and in this blog post I’m going to try to explain what I think is the easiest and most flexible solution. I will walk you through using MSBuild to create a simple build script, and I’ll even show how to (optionally) integrate with a Team City build server. All of the code from this post can be found at https://github.com/srkirkland/BuildVersion. Create CommonAssemblyInfo.cs The first step is to create a common location for the repeated assembly info that is spread across all of your projects. Create a new solution-level file (I usually create a Build/ folder in the solution root, but anywhere reachable by all your projects will do) called CommonAssemblyInfo.cs. In here you can put any information common to all your assemblies, including the version number. An example CommonAssemblyInfo.cs is as follows: using System.Reflection; using System.Resources; using System.Runtime.InteropServices; [assembly: AssemblyCompany("University of California, Davis")] [assembly: AssemblyProduct("BuildVersionTest")] [assembly: AssemblyCopyright("Scott Kirkland & UC Regents")] [assembly: AssemblyConfiguration("")] [assembly: AssemblyTrademark("")] [assembly: ComVisible(false)] [assembly: AssemblyVersion("1.2.3.4")] //Will be replaced [assembly: NeutralResourcesLanguage("en-US")] .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Cleanup AssemblyInfo.cs & Link CommonAssemblyInfo.cs For each of your projects, you’ll want to clean up your assembly info to contain only information that is unique to that assembly – everything else will go in the CommonAssemblyInfo.cs file. For most of my projects, that just means setting the AssemblyTitle, though you may feel AssemblyDescription is warranted. An example AssemblyInfo.cs file is as follows: using System.Reflection; [assembly: AssemblyTitle("BuildVersionTest")] .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Next, you need to “link” the CommonAssemblyinfo.cs file into your projects right beside your newly lean AssemblyInfo.cs file. To do this, right click on your project and choose Add | Existing Item from the context menu. Navigate to your CommonAssemblyinfo.cs file but instead of clicking Add, click the little down-arrow next to add and choose “Add as Link.” You should see a little link graphic similar to this: We’ve actually reduced complexity a lot already, because if you build all of your assemblies will have the same common info, including the product name and our static (fake) assembly version. Let’s take this one step further and introduce a build script. Create an MSBuild file What we want from the build script (for now) is basically just to have the common assembly version number changed via a parameter (eventually to be passed in by the build server) and then for the project to build. Also we’d like to have a flexibility to define what build configuration to use (debug, release, etc). In order to find/replace the version number, we are going to use a Regular Expression to find and replace the text within your CommonAssemblyInfo.cs file. There are many other ways to do this using community build task add-ins, but since we want to keep it simple let’s just define the Regular Expression task manually in a new file, Build.tasks (this example taken from the NuGet build.tasks file). <?xml version="1.0" encoding="utf-8"?> <Project ToolsVersion="4.0" DefaultTargets="Go" xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> <UsingTask TaskName="RegexTransform" TaskFactory="CodeTaskFactory" AssemblyFile="$(MSBuildToolsPath)\Microsoft.Build.Tasks.v4.0.dll"> <ParameterGroup> <Items ParameterType="Microsoft.Build.Framework.ITaskItem[]" /> </ParameterGroup> <Task> <Using Namespace="System.IO" /> <Using Namespace="System.Text.RegularExpressions" /> <Using Namespace="Microsoft.Build.Framework" /> <Code Type="Fragment" Language="cs"> <![CDATA[ foreach(ITaskItem item in Items) { string fileName = item.GetMetadata("FullPath"); string find = item.GetMetadata("Find"); string replaceWith = item.GetMetadata("ReplaceWith"); if(!File.Exists(fileName)) { Log.LogError(null, null, null, null, 0, 0, 0, 0, String.Format("Could not find version file: {0}", fileName), new object[0]); } string content = File.ReadAllText(fileName); File.WriteAllText( fileName, Regex.Replace( content, find, replaceWith ) ); } ]]> </Code> </Task> </UsingTask> </Project> .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } If you glance at the code, you’ll see it’s really just going a Regex.Replace() on a given file, which is exactly what we need. Now we are ready to write our build file, called (by convention) Build.proj. <?xml version="1.0" encoding="utf-8"?> <Project ToolsVersion="4.0" DefaultTargets="Go" xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> <Import Project="$(MSBuildProjectDirectory)\Build.tasks" /> <PropertyGroup> <Configuration Condition="'$(Configuration)' == ''">Debug</Configuration> <SolutionRoot>$(MSBuildProjectDirectory)</SolutionRoot> </PropertyGroup> <ItemGroup> <RegexTransform Include="$(SolutionRoot)\CommonAssemblyInfo.cs"> <Find>(?<major>\d+)\.(?<minor>\d+)\.\d+\.(?<revision>\d+)</Find> <ReplaceWith>$(BUILD_NUMBER)</ReplaceWith> </RegexTransform> </ItemGroup> <Target Name="Go" DependsOnTargets="UpdateAssemblyVersion; Build"> </Target> <Target Name="UpdateAssemblyVersion" Condition="'$(BUILD_NUMBER)' != ''"> <RegexTransform Items="@(RegexTransform)" /> </Target> <Target Name="Build"> <MSBuild Projects="$(SolutionRoot)\BuildVersionTest.sln" Targets="Build" /> </Target> </Project> Reviewing this MSBuild file, we see that by default the “Go” target will be called, which in turn depends on “UpdateAssemblyVersion” and then “Build.” We go ahead and import the Bulid.tasks file and then setup some handy properties for setting the build configuration and solution root (in this case, my build files are in the solution root, but we might want to create a Build/ directory later). The rest of the file flows logically, we setup the RegexTransform to match version numbers such as <major>.<minor>.1.<revision> (1.2.3.4 in our example) and replace it with a $(BUILD_NUMBER) parameter which will be supplied externally. The first target, “UpdateAssemblyVersion” just runs the RegexTransform, and the second target, “Build” just runs the default MSBuild on our solution. Testing the MSBuild file locally Now we have a build file which can replace assembly version numbers and build, so let’s setup a quick batch file to be able to build locally. To do this you simply create a file called Build.cmd and have it call MSBuild on your Build.proj file. I’ve added a bit more flexibility so you can specify build configuration and version number, which makes your Build.cmd look as follows: set config=%1 if "%config%" == "" ( set config=debug ) set version=%2 if "%version%" == "" ( set version=2.3.4.5 ) %WINDIR%\Microsoft.NET\Framework\v4.0.30319\msbuild Build.proj /p:Configuration="%config%" /p:build_number="%version%" .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Now if you click on the Build.cmd file, you will get a default debug build using the version 2.3.4.5. Let’s run it in a command window with the parameters set for a release build version 2.0.1.453. Excellent! We can now run one simple command and govern the build configuration and version number of our entire solution. Each DLL produced will have the same version number, making determining which version of a library you are running very simple and accurate. Configure the build server (TeamCity) Of course you are not really going to want to run a build command manually every time, and typing in incrementing version numbers will also not be ideal. A good solution is to have a computer (or set of computers) act as a build server and build your code for you, providing you a consistent environment, excellent reporting, and much more. One of the most popular Build Servers is JetBrains’ TeamCity, and this last section will show you the few configuration parameters to use when setting up a build using your MSBuild file created earlier. If you are using a different build server, the same principals should apply. First, when setting up the project you want to specify the “Build Number Format,” often given in the form <major>.<minor>.<revision>.<build>. In this case you will set major/minor manually, and optionally revision (or you can use your VCS revision number with %build.vcs.number%), and then build using the {0} wildcard. Thus your build number format might look like this: 2.0.1.{0}. During each build, this value will be created and passed into the $BUILD_NUMBER variable of our Build.proj file, which then uses it to decorate your assemblies with the proper version. After setting up the build number, you must choose MSBuild as the Build Runner, then provide a path to your build file (Build.proj). After specifying your MSBuild Version (equivalent to your .NET Framework Version), you have the option to specify targets (the default being “Go”) and additional MSBuild parameters. The one parameter that is often useful is manually setting the configuration property (/p:Configuration="Release") if you want something other than the default (which is Debug in our example). Your resulting configuration will look something like this: [Under General Settings] [Build Runner Settings] Now every time your build is run, a newly incremented build version number will be generated and passed to MSBuild, which will then version your assemblies and build your solution. A Quick Review Our goal was to version our output assemblies in an automated way, and we accomplished it by performing a few quick steps: Move the common assembly information, including version, into a linked CommonAssemblyInfo.cs file Create a simple MSBuild script to replace the common assembly version number and build your solution Direct your build server to use the created MSBuild script That’s really all there is to it. You can find all of the code from this post at https://github.com/srkirkland/BuildVersion. Enjoy!

Read the article
Toorcon 15 (2013)

- by danx

The Toorcon gang (senior staff): h1kari (founder), nfiltr8, and Geo Introduction to Toorcon 15 (2013) A Tale of One Software Bypass of MS Windows 8 Secure Boot Breaching SSL, One Byte at a Time Running at 99%: Surviving an Application DoS Security Response in the Age of Mass Customized Attacks x86 Rewriting: Defeating RoP and other Shinanighans Clowntown Express: interesting bugs and running a bug bounty program Active Fingerprinting of Encrypted VPNs Making Attacks Go Backwards Mask Your Checksums—The Gorry Details Adventures with weird machines thirty years after "Reflections on Trusting Trust" Introduction to Toorcon 15 (2013) Toorcon 15 is the 15th annual security conference held in San Diego. I've attended about a third of them and blogged about previous conferences I attended here starting in 2003. As always, I've only summarized the talks I attended and interested me enough to write about them. Be aware that I may have misrepresented the speaker's remarks and that they are not my remarks or opinion, or those of my employer, so don't quote me or them. Those seeking further details may contact the speakers directly or use The Google. For some talks, I have a URL for further information. A Tale of One Software Bypass of MS Windows 8 Secure Boot Andrew Furtak and Oleksandr Bazhaniuk Yuri Bulygin, Oleksandr ("Alex") Bazhaniuk, and (not present) Andrew Furtak Yuri and Alex talked about UEFI and Bootkits and bypassing MS Windows 8 Secure Boot, with vendor recommendations. They previously gave this talk at the BlackHat 2013 conference. MS Windows 8 Secure Boot Overview UEFI (Unified Extensible Firmware Interface) is interface between hardware and OS. UEFI is processor and architecture independent. Malware can replace bootloader (bootx64.efi, bootmgfw.efi). Once replaced can modify kernel. Trivial to replace bootloader. Today many legacy bootkits—UEFI replaces them most of them. MS Windows 8 Secure Boot verifies everything you load, either through signatures or hashes. UEFI firmware relies on secure update (with signed update). You would think Secure Boot would rely on ROM (such as used for phones0, but you can't do that for PCs—PCs use writable memory with signatures DXE core verifies the UEFI boat loader(s) OS Loader (winload.efi, winresume.efi) verifies the OS kernel A chain of trust is established with a root key (Platform Key, PK), which is a cert belonging to the platform vendor. Key Exchange Keys (KEKs) verify an "authorized" database (db), and "forbidden" database (dbx). X.509 certs with SHA-1/SHA-256 hashes. Keys are stored in non-volatile (NV) flash-based NVRAM. Boot Services (BS) allow adding/deleting keys (can't be accessed once OS starts—which uses Run-Time (RT)). Root cert uses RSA-2048 public keys and PKCS#7 format signatures. SecureBoot — enable disable image signature checks SetupMode — update keys, self-signed keys, and secure boot variables CustomMode — allows updating keys Secure Boot policy settings are: always execute, never execute, allow execute on security violation, defer execute on security violation, deny execute on security violation, query user on security violation Attacking MS Windows 8 Secure Boot Secure Boot does NOT protect from physical access. Can disable from console. Each BIOS vendor implements Secure Boot differently. There are several platform and BIOS vendors. It becomes a "zoo" of implementations—which can be taken advantage of. Secure Boot is secure only when all vendors implement it correctly. Allow only UEFI firmware signed updates protect UEFI firmware from direct modification in flash memory protect FW update components program SPI controller securely protect secure boot policy settings in nvram protect runtime api disable compatibility support module which allows unsigned legacy Can corrupt the Platform Key (PK) EFI root certificate variable in SPI flash. If PK is not found, FW enters setup mode wich secure boot turned off. Can also exploit TPM in a similar manner. One is not supposed to be able to directly modify the PK in SPI flash from the OS though. But they found a bug that they can exploit from User Mode (undisclosed) and demoed the exploit. It loaded and ran their own bootkit. The exploit requires a reboot. Multiple vendors are vulnerable. They will disclose this exploit to vendors in the future. Recommendations: allow only signed updates protect UEFI fw in ROM protect EFI variable store in ROM Breaching SSL, One Byte at a Time Yoel Gluck and Angelo Prado Angelo Prado and Yoel Gluck, Salesforce.com CRIME is software that performs a "compression oracle attack." This is possible because the SSL protocol doesn't hide length, and because SSL compresses the header. CRIME requests with every possible character and measures the ciphertext length. Look for the plaintext which compresses the most and looks for the cookie one byte-at-a-time. SSL Compression uses LZ77 to reduce redundancy. Huffman coding replaces common byte sequences with shorter codes. US CERT thinks the SSL compression problem is fixed, but it isn't. They convinced CERT that it wasn't fixed and they issued a CVE. BREACH, breachattrack.com BREACH exploits the SSL response body (Accept-Encoding response, Content-Encoding). It takes advantage of the fact that the response is not compressed. BREACH uses gzip and needs fairly "stable" pages that are static for ~30 seconds. It needs attacker-supplied content (say from a web form or added to a URL parameter). BREACH listens to a session's requests and responses, then inserts extra requests and responses. Eventually, BREACH guesses a session's secret key. Can use compression to guess contents one byte at-a-time. For example, "Supersecret SupersecreX" (a wrong guess) compresses 10 bytes, and "Supersecret Supersecret" (a correct guess) compresses 11 bytes, so it can find each character by guessing every character. To start the guess, BREACH needs at least three known initial characters in the response sequence. Compression length then "leaks" information. Some roadblocks include no winners (all guesses wrong) or too many winners (multiple possibilities that compress the same). The solutions include: lookahead (guess 2 or 3 characters at-a-time instead of 1 character). Expensive rollback to last known conflict check compression ratio can brute-force first 3 "bootstrap" characters, if needed (expensive) block ciphers hide exact plain text length. Solution is to align response in advance to block size Mitigations length: use variable padding secrets: dynamic CSRF tokens per request secret: change over time separate secret to input-less servlets Future work eiter understand DEFLATE/GZIP HTTPS extensions Running at 99%: Surviving an Application DoS Ryan Huber Ryan Huber, Risk I/O Ryan first discussed various ways to do a denial of service (DoS) attack against web services. One usual method is to find a slow web page and do several wgets. Or download large files. Apache is not well suited at handling a large number of connections, but one can put something in front of it Can use Apache alternatives, such as nginx How to identify malicious hosts short, sudden web requests user-agent is obvious (curl, python) same url requested repeatedly no web page referer (not normal) hidden links. hide a link and see if a bot gets it restricted access if not your geo IP (unless the website is global) missing common headers in request regular timing first seen IP at beginning of attack count requests per hosts (usually a very large number) Use of captcha can mitigate attacks, but you'll lose a lot of genuine users. Bouncer, goo.gl/c2vyEc and www.github.com/rawdigits/Bouncer Bouncer is software written by Ryan in netflow. Bouncer has a small, unobtrusive footprint and detects DoS attempts. It closes blacklisted sockets immediately (not nice about it, no proper close connection). Aggregator collects requests and controls your web proxies. Need NTP on the front end web servers for clean data for use by bouncer. Bouncer is also useful for a popularity storm ("Slashdotting") and scraper storms. Future features: gzip collection data, documentation, consumer library, multitask, logging destroyed connections. Takeaways: DoS mitigation is easier with a complete picture Bouncer designed to make it easier to detect and defend DoS—not a complete cure Security Response in the Age of Mass Customized Attacks Peleus Uhley and Karthik Raman Peleus Uhley and Karthik Raman, Adobe ASSET, blogs.adobe.com/asset/ Peleus and Karthik talked about response to mass-customized exploits. Attackers behave much like a business. "Mass customization" refers to concept discussed in the book Future Perfect by Stan Davis of Harvard Business School. Mass customization is differentiating a product for an individual customer, but at a mass production price. For example, the same individual with a debit card receives basically the same customized ATM experience around the world. Or designing your own PC from commodity parts. Exploit kits are another example of mass customization. The kits support multiple browsers and plugins, allows new modules. Exploit kits are cheap and customizable. Organized gangs use exploit kits. A group at Berkeley looked at 77,000 malicious websites (Grier et al., "Manufacturing Compromise: The Emergence of Exploit-as-a-Service", 2012). They found 10,000 distinct binaries among them, but derived from only a dozen or so exploit kits. Characteristics of Mass Malware: potent, resilient, relatively low cost Technical characteristics: multiple OS, multipe payloads, multiple scenarios, multiple languages, obfuscation Response time for 0-day exploits has gone down from ~40 days 5 years ago to about ~10 days now. So the drive with malware is towards mass customized exploits, to avoid detection There's plenty of evicence that exploit development has Project Manager bureaucracy. They infer from the malware edicts to: support all versions of reader support all versions of windows support all versions of flash support all browsers write large complex, difficult to main code (8750 lines of JavaScript for example Exploits have "loose coupling" of multipe versions of software (adobe), OS, and browser. This allows specific attacks against specific versions of multiple pieces of software. Also allows exploits of more obscure software/OS/browsers and obscure versions. Gave examples of exploits that exploited 2, 3, 6, or 14 separate bugs. However, these complete exploits are more likely to be buggy or fragile in themselves and easier to defeat. Future research includes normalizing malware and Javascript. Conclusion: The coming trend is that mass-malware with mass zero-day attacks will result in mass customization of attacks. x86 Rewriting: Defeating RoP and other Shinanighans Richard Wartell Richard Wartell The attack vector we are addressing here is: First some malware causes a buffer overflow. The malware has no program access, but input access and buffer overflow code onto stack Later the stack became non-executable. The workaround malware used was to write a bogus return address to the stack jumping to malware Later came ASLR (Address Space Layout Randomization) to randomize memory layout and make addresses non-deterministic. The workaround malware used was to jump t existing code segments in the program that can be used in bad ways "RoP" is Return-oriented Programming attacks. RoP attacks use your own code and write return address on stack to (existing) expoitable code found in program ("gadgets"). Pinkie Pie was paid $60K last year for a RoP attack. One solution is using anti-RoP compilers that compile source code with NO return instructions. ASLR does not randomize address space, just "gadgets". IPR/ILR ("Instruction Location Randomization") randomizes each instruction with a virtual machine. Richard's goal was to randomize a binary with no source code access. He created "STIR" (Self-Transofrming Instruction Relocation). STIR disassembles binary and operates on "basic blocks" of code. The STIR disassembler is conservative in what to disassemble. Each basic block is moved to a random location in memory. Next, STIR writes new code sections with copies of "basic blocks" of code in randomized locations. The old code is copied and rewritten with jumps to new code. the original code sections in the file is marked non-executible. STIR has better entropy than ASLR in location of code. Makes brute force attacks much harder. STIR runs on MS Windows (PEM) and Linux (ELF). It eliminated 99.96% or more "gadgets" (i.e., moved the address). Overhead usually 5-10% on MS Windows, about 1.5-4% on Linux (but some code actually runs faster!). The unique thing about STIR is it requires no source access and the modified binary fully works! Current work is to rewrite code to enforce security policies. For example, don't create a *.{exe,msi,bat} file. Or don't connect to the network after reading from the disk. Clowntown Express: interesting bugs and running a bug bounty program Collin Greene Collin Greene, Facebook Collin talked about Facebook's bug bounty program. Background at FB: FB has good security frameworks, such as security teams, external audits, and cc'ing on diffs. But there's lots of "deep, dark, forgotten" parts of legacy FB code. Collin gave several examples of bountied bugs. Some bounty submissions were on software purchased from a third-party (but bounty claimers don't know and don't care). We use security questions, as does everyone else, but they are basically insecure (often easily discoverable). Collin didn't expect many bugs from the bounty program, but they ended getting 20+ good bugs in first 24 hours and good submissions continue to come in. Bug bounties bring people in with different perspectives, and are paid only for success. Bug bounty is a better use of a fixed amount of time and money versus just code review or static code analysis. The Bounty program started July 2011 and paid out $1.5 million to date. 14% of the submissions have been high priority problems that needed to be fixed immediately. The best bugs come from a small % of submitters (as with everything else)—the top paid submitters are paid 6 figures a year. Spammers like to backstab competitors. The youngest sumitter was 13. Some submitters have been hired. Bug bounties also allows to see bugs that were missed by tools or reviews, allowing improvement in the process. Bug bounties might not work for traditional software companies where the product has release cycle or is not on Internet. Active Fingerprinting of Encrypted VPNs Anna Shubina Anna Shubina, Dartmouth Institute for Security, Technology, and Society (I missed the start of her talk because another track went overtime. But I have the DVD of the talk, so I'll expand later) IPsec leaves fingerprints. Using netcat, one can easily visually distinguish various crypto chaining modes just from packet timing on a chart (example, DES-CBC versus AES-CBC) One can tell a lot about VPNs just from ping roundtrips (such as what router is used) Delayed packets are not informative about a network, especially if far away from the network More needed to explore about how TCP works in real life with respect to timing Making Attacks Go Backwards Fuzzynop FuzzyNop, Mandiant This talk is not about threat attribution (finding who), product solutions, politics, or sales pitches. But who are making these malware threats? It's not a single person or group—they have diverse skill levels. There's a lot of fat-fingered fumblers out there. Always look for low-hanging fruit first: "hiding" malware in the temp, recycle, or root directories creation of unnamed scheduled tasks obvious names of files and syscalls ("ClearEventLog") uncleared event logs. Clearing event log in itself, and time of clearing, is a red flag and good first clue to look for on a suspect system Reverse engineering is hard. Disassembler use takes practice and skill. A popular tool is IDA Pro, but it takes multiple interactive iterations to get a clean disassembly. Key loggers are used a lot in targeted attacks. They are typically custom code or built in a backdoor. A big tip-off is that non-printable characters need to be printed out (such as "[Ctrl]" "[RightShift]") or time stamp printf strings. Look for these in files. Presence is not proof they are used. Absence is not proof they are not used. Java exploits. Can parse jar file with idxparser.py and decomile Java file. Java typially used to target tech companies. Backdoors are the main persistence mechanism (provided externally) for malware. Also malware typically needs command and control. Application of Artificial Intelligence in Ad-Hoc Static Code Analysis John Ashaman John Ashaman, Security Innovation Initially John tried to analyze open source files with open source static analysis tools, but these showed thousands of false positives. Also tried using grep, but tis fails to find anything even mildly complex. So next John decided to write his own tool. His approach was to first generate a call graph then analyze the graph. However, the problem is that making a call graph is really hard. For example, one problem is "evil" coding techniques, such as passing function pointer. First the tool generated an Abstract Syntax Tree (AST) with the nodes created from method declarations and edges created from method use. Then the tool generated a control flow graph with the goal to find a path through the AST (a maze) from source to sink. The algorithm is to look at adjacent nodes to see if any are "scary" (a vulnerability), using heuristics for search order. The tool, called "Scat" (Static Code Analysis Tool), currently looks for C# vulnerabilities and some simple PHP. Later, he plans to add more PHP, then JSP and Java. For more information see his posts in Security Innovation blog and NRefactory on GitHub. Mask Your Checksums—The Gorry Details Eric (XlogicX) Davisson Eric (XlogicX) Davisson Sometimes in emailing or posting TCP/IP packets to analyze problems, you may want to mask the IP address. But to do this correctly, you need to mask the checksum too, or you'll leak information about the IP. Problem reports found in stackoverflow.com, sans.org, and pastebin.org are usually not masked, but a few companies do care. If only the IP is masked, the IP may be guessed from checksum (that is, it leaks data). Other parts of packet may leak more data about the IP. TCP and IP checksums both refer to the same data, so can get more bits of information out of using both checksums than just using one checksum. Also, one can usually determine the OS from the TTL field and ports in a packet header. If we get hundreds of possible results (16x each masked nibble that is unknown), one can do other things to narrow the results, such as look at packet contents for domain or geo information. With hundreds of results, can import as CSV format into a spreadsheet. Can corelate with geo data and see where each possibility is located. Eric then demoed a real email report with a masked IP packet attached. Was able to find the exact IP address, given the geo and university of the sender. Point is if you're going to mask a packet, do it right. Eric wouldn't usually bother, but do it correctly if at all, to not create a false impression of security. Adventures with weird machines thirty years after "Reflections on Trusting Trust" Sergey Bratus Sergey Bratus, Dartmouth College (and Julian Bangert and Rebecca Shapiro, not present) "Reflections on Trusting Trust" refers to Ken Thompson's classic 1984 paper. "You can't trust code that you did not totally create yourself." There's invisible links in the chain-of-trust, such as "well-installed microcode bugs" or in the compiler, and other planted bugs. Thompson showed how a compiler can introduce and propagate bugs in unmodified source. But suppose if there's no bugs and you trust the author, can you trust the code? Hell No! There's too many factors—it's Babylonian in nature. Why not? Well, Input is not well-defined/recognized (code's assumptions about "checked" input will be violated (bug/vunerabiliy). For example, HTML is recursive, but Regex checking is not recursive. Input well-formed but so complex there's no telling what it does For example, ELF file parsing is complex and has multiple ways of parsing. Input is seen differently by different pieces of program or toolchain Any Input is a program input executes on input handlers (drives state changes & transitions) only a well-defined execution model can be trusted (regex/DFA, PDA, CFG) Input handler either is a "recognizer" for the inputs as a well-defined language (see langsec.org) or it's a "virtual machine" for inputs to drive into pwn-age ELF ABI (UNIX/Linux executible file format) case study. Problems can arise from these steps (without planting bugs): compiler linker loader ld.so/rtld relocator DWARF (debugger info) exceptions The problem is you can't really automatically analyze code (it's the "halting problem" and undecidable). Only solution is to freeze code and sign it. But you can't freeze everything! Can't freeze ASLR or loading—must have tables and metadata. Any sufficiently complex input data is the same as VM byte code Example, ELF relocation entries + dynamic symbols == a Turing Complete Machine (TM). @bxsays created a Turing machine in Linux from relocation data (not code) in an ELF file. For more information, see Rebecca "bx" Shapiro's presentation from last year's Toorcon, "Programming Weird Machines with ELF Metadata" @bxsays did same thing with Mach-O bytecode Or a DWARF exception handling data .eh_frame + glibc == Turning Machine X86 MMU (IDT, GDT, TSS): used address translation to create a Turning Machine. Page handler reads and writes (on page fault) memory. Uses a page table, which can be used as Turning Machine byte code. Example on Github using this TM that will fly a glider across the screen Next Sergey talked about "Parser Differentials". That having one input format, but two parsers, will create confusion and opportunity for exploitation. For example, CSRs are parsed during creation by cert requestor and again by another parser at the CA. Another example is ELF—several parsers in OS tool chain, which are all different. Can have two different Program Headers (PHDRs) because ld.so parses multiple PHDRs. The second PHDR can completely transform the executable. This is described in paper in the first issue of International Journal of PoC. Conclusions trusting computers not only about bugs! Bugs are part of a problem, but no by far all of it complex data formats means bugs no "chain of trust" in Babylon! (that is, with parser differentials) we need to squeeze complexity out of data until data stops being "code equivalent" Further information See and langsec.org. USENIX WOOT 2013 (Workshop on Offensive Technologies) for "weird machines" papers and videos.

Read the article
Accelerated C++, problem 5-6 (copying values from inside a vector to the front)

- by Darel

Hello, I'm working through the exercises in Accelerated C++ and I'm stuck on question 5-6. Here's the problem description: (somewhat abbreviated, I've removed extraneous info.) 5-6. Write the extract_fails function so that it copies the records for the passing students to the beginning of students, and then uses the resize function to remove the extra elements from the end of students. (students is a vector of student structures. student structures contain an individual student's name and grades.) More specifically, I'm having trouble getting the vector.insert function to properly copy the passing student structures to the start of the vector students. Here's the extract_fails function as I have it so far (note it doesn't resize the vector yet, as directed by the problem description; that should be trivial once I get past my current issue.) // Extract the students who failed from the "students" vector. void extract_fails(vector<Student_info>& students) { typedef vector<Student_info>::size_type str_sz; typedef vector<Student_info>::iterator iter; iter it = students.begin(); str_sz i = 0, count = 0; while (it != students.end()) { // fgrade tests wether or not the student failed if (!fgrade(*it)) { // if student passed, copy to front of vector students.insert(students.begin(), it, it); // tracks of the number of passing students(so we can properly resize the array) count++; } cout << it->name << endl; // output to verify that each student is iterated to it++; } } The code compiles and runs, but the students vector isn't adding any student structures to its front. My program's output displays that the students vector is unchanged. Here's my complete source code, followed by a sample input file (I redirect input from the console by typing " < grades" after the compiled program name at the command prompt.) #include <iostream> #include <string> #include <algorithm> // to get the declaration of `sort' #include <stdexcept> // to get the declaration of `domain_error' #include <vector> // to get the declaration of `vector' //driver program for grade partitioning examples using std::cin; using std::cout; using std::endl; using std::string; using std::domain_error; using std::sort; using std::vector; using std::max; using std::istream; struct Student_info { std::string name; double midterm, final; std::vector<double> homework; }; bool compare(const Student_info&, const Student_info&); std::istream& read(std::istream&, Student_info&); std::istream& read_hw(std::istream&, std::vector<double>&); double median(std::vector<double>); double grade(double, double, double); double grade(double, double, const std::vector<double>&); double grade(const Student_info&); bool fgrade(const Student_info&); void extract_fails(vector<Student_info>& v); int main() { vector<Student_info> vs; Student_info s; string::size_type maxlen = 0; while (read(cin, s)) { maxlen = max(maxlen, s.name.size()); vs.push_back(s); } sort(vs.begin(), vs.end(), compare); extract_fails(vs); // display the new, modified vector - it should be larger than // the input vector, due to some student structures being // added to the front of the vector. cout << "count: " << vs.size() << endl << endl; vector<Student_info>::iterator it = vs.begin(); while (it != vs.end()) cout << it++->name << endl; return 0; } // Extract the students who failed from the "students" vector. void extract_fails(vector<Student_info>& students) { typedef vector<Student_info>::size_type str_sz; typedef vector<Student_info>::iterator iter; iter it = students.begin(); str_sz i = 0, count = 0; while (it != students.end()) { // fgrade tests wether or not the student failed if (!fgrade(*it)) { // if student passed, copy to front of vector students.insert(students.begin(), it, it); // tracks of the number of passing students(so we can properly resize the array) count++; } cout << it->name << endl; // output to verify that each student is iterated to it++; } } bool compare(const Student_info& x, const Student_info& y) { return x.name < y.name; } istream& read(istream& is, Student_info& s) { // read and store the student's name and midterm and final exam grades is >> s.name >> s.midterm >> s.final; read_hw(is, s.homework); // read and store all the student's homework grades return is; } // read homework grades from an input stream into a `vector<double>' istream& read_hw(istream& in, vector<double>& hw) { if (in) { // get rid of previous contents hw.clear(); // read homework grades double x; while (in >> x) hw.push_back(x); // clear the stream so that input will work for the next student in.clear(); } return in; } // compute the median of a `vector<double>' // note that calling this function copies the entire argument `vector' double median(vector<double> vec) { typedef vector<double>::size_type vec_sz; vec_sz size = vec.size(); if (size == 0) throw domain_error("median of an empty vector"); sort(vec.begin(), vec.end()); vec_sz mid = size/2; return size % 2 == 0 ? (vec[mid] + vec[mid-1]) / 2 : vec[mid]; } // compute a student's overall grade from midterm and final exam grades and homework grade double grade(double midterm, double final, double homework) { return 0.2 * midterm + 0.4 * final + 0.4 * homework; } // compute a student's overall grade from midterm and final exam grades // and vector of homework grades. // this function does not copy its argument, because `median' does so for us. double grade(double midterm, double final, const vector<double>& hw) { if (hw.size() == 0) throw domain_error("student has done no homework"); return grade(midterm, final, median(hw)); } double grade(const Student_info& s) { return grade(s.midterm, s.final, s.homework); } // predicate to determine whether a student failed bool fgrade(const Student_info& s) { return grade(s) < 60; } Sample input file: Moo 100 100 100 100 100 100 100 100 Fail1 45 55 65 80 90 70 65 60 Moore 75 85 77 59 0 85 75 89 Norman 57 78 73 66 78 70 88 89 Olson 89 86 70 90 55 73 80 84 Peerson 47 70 82 73 50 87 73 71 Baker 67 72 73 40 0 78 55 70 Davis 77 70 82 65 70 77 83 81 Edwards 77 72 73 80 90 93 75 90 Fail2 55 55 65 50 55 60 65 60 Thanks to anyone who takes the time to look at this!

Read the article

Developer IT

yaakov davis - Page 8 - Developer IT

Search Results

Search found 186 results on 8 pages for 'yaakov davis'.

Page 8/8 | < Previous Page | 4 5 6 7 8

Your thoughts on Best Practices for Scientific Computing?

- by John Smith

Read the article

RSS Feeds currently on Simple-Talk

- by Andrew Clarke

Read the article

SQL SERVER – PAGELATCH_DT, PAGELATCH_EX, PAGELATCH_KP, PAGELATCH_SH, PAGELATCH_UP – Wait Type – Day 12 of 28

- by pinaldave

Read the article

Scripting out Contained Database Users

- by Argenis

Read the article

Linq to List and IEnumerable issues

- by Otaku

Read the article

Listing common SQL Code Smells.

- by Phil Factor

Read the article

Open source CMS for a university department

- by Greg Kuperberg

Read the article

GIT: clone works, remote push doesn't. Remote repository over copssh.

- by Rui

Read the article

Simple MSBuild Configuration: Updating Assemblies With A Version Number

- by srkirkland

Read the article

Toorcon 15 (2013)

- by danx

Read the article

Accelerated C++, problem 5-6 (copying values from inside a vector to the front)

- by Darel

Read the article

< Previous Page | 4 5 6 7 8