Search Results

Search found 26190 results on 1048 pages for 'py test'.

Page 115/1048 | < Previous Page | 111 112 113 114 115 116 117 118 119 120 121 122 | Next Page >

sorl-thumbnail unit tests fail by 1 pixel (!)

- by stevejalim

Hi I'm using sorl-thumbnail in a Django 1.2 (currently 1.2 RC) project and getting a surprising failure of four of sorl's built-in unit tests. Essentially, the resized images are all 1px shorter than the unit tests expect them to be. See below for details I'm developing on OSX 10.5.8 (not Snow Leopard) with Python 2.5.1 (r251:54863, Feb 6 2009, 19:02:12) and PIL 1.1.6. Any thoughts what might be up? Cheers Steve ====================================================================== FAIL: test_extension (sorl.thumbnail.tests.fields.FieldTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/django/myprojectnamehere/lib/sorl/thumbnail/tests/fields.py", line 66, in test_extension self.verify_thumbnail((50, 37), thumb, expected_filename) File "/usr/local/django/myprojectnamehere/lib/sorl/thumbnail/tests/base.py", line 92, in verify_thumbnail self.assertEqual(image.size, expected_size) AssertionError: (50, 38) != (50, 37) ====================================================================== FAIL: test_thumbnail (sorl.thumbnail.tests.fields.ImageWithThumbnailsFieldTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/django/myprojectnamehere/lib/sorl/thumbnail/tests/fields.py", line 111, in test_thumbnail self.verify_thumbnail((50, 37), thumb, expected_filename) File "/usr/local/django/myprojectnamehere/lib/sorl/thumbnail/tests/base.py", line 92, in verify_thumbnail self.assertEqual(image.size, expected_size) AssertionError: (50, 38) != (50, 37) ====================================================================== FAIL: testTag (sorl.thumbnail.tests.templatetags.ThumbnailTagTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/django/myprojectnamehere/lib/sorl/thumbnail/tests/templatetags.py", line 118, in testTag self.verify_thumbnail((90, 67), expected_filename=expected_fn) File "/usr/local/django/myprojectnamehere/lib/sorl/thumbnail/tests/base.py", line 92, in verify_thumbnail self.assertEqual(image.size, expected_size) AssertionError: (90, 68) != (90, 67)

Read the article
Python regex look-behind requires fixed-width pattern

- by invictus

Hi When trying to extract the title of a html-page I have always used the following regex: (?<=<title.*>)([\s\S]*)(?=</title>) Which will extract everything between the tags in a document and ignore the tags themselves. However, when trying to use this regex in Python it raises the following Exception: Traceback (most recent call last): File "test.py", line 21, in pattern = re.compile('(?<=)([\s\S]*)(?=)') File "C:\Python31\lib\re.py", line 205, in compile return _compile(pattern, flags) File "C:\Python31\lib\re.py", line 273, in _compile p = sre_compile.compile(pattern, flags) File "C:\Python31\lib\sre_compile.py", line 495, in compile code = _code(p, flags) File "C:\Python31\lib\sre_compile.py", line 480, in _code _compile(code, p.data, flags) File "C:\Python31\lib\sre_compile.py", line 115, in _compile raise error("look-behind requires fixed-width pattern") sre_constants.error: look-behind requires fixed-width pattern The code I am using is: pattern = re.compile('(?<=<title.*>)([\s\S]*)(?=</title>)') m = pattern.search(f) if I do some minimal adjustments it works: pattern = re.compile('(?<=<title>)([\s\S]*)(?=</title>)') m = pattern.search(f) This will, however, not take into account potential html titles that for some reason have attributes or similar. Anyone know a good workaround for this issue? Any tips are appreciated.

Read the article
How do I recover from pushing a gitosis.conf file with parsing errors due to line breaks?

- by Kasia

I have successfully set up gitosis for an Android mirror (containing multiple git repositories). While adding a new .git path following writable= in gitosis.conf I managed to insert a few line breaks. Saved, committed and pushed to server when I received the following parsing error: Traceback (most recent call last): File "/usr/bin/gitosis-run-hook", line 8, in load_entry_point('gitosis==0.2', 'console_scripts', 'gitosis-run-hook')() File "/usr/lib/python2.5/site-packages/gitosis-0.2-py2.5.egg/gitosis/app.py", line 24, in run return app.main() File "/usr/lib/python2.5/site-packages/gitosis-0.2-py2.5.egg/gitosis/app.py", line 38, in main self.handle_args(parser, cfg, options, args) File "/usr/lib/python2.5/site-packages/gitosis-0.2-py2.5.egg/gitosis/run_hook.py", line 75, in handle_args post_update(cfg, git_dir) File "/usr/lib/python2.5/site-packages/gitosis-0.2-py2.5.egg/gitosis/run_hook.py", line 33, in post_update cfg.read(os.path.join(export, '..', 'gitosis.conf')) File "/usr/lib/python2.5/ConfigParser.py", line 267, in read self._read(fp, filename) File "/usr/lib/python2.5/ConfigParser.py", line 490, in _read raise e ConfigParser.ParsingError: File contains parsing errors: ./gitosis-export/../gitosis.conf (...) I have removed the line break and amendend the commit by git commit -m "fix linebreak" --amend However git push still yields the exact same error. It leads me to believe gitosis is preventing me from doing any further pushes. How do I recover from this?

Read the article
How to write a custom solution using a python package, modules etc

- by morpheous

I am writing a packacge foobar which consists of the modules alice, bob, charles and david. From my understanding of Python packages and modules, this means I will create a folder foobar, with the following subdirectories and files (please correct if I am wrong) foobar/ __init__.py alice/alice.py bob/bob.py charles/charles.py david/david.py The package should be executable, so that in addition to making the modules alice, bob etc available as 'libraries', I should also be able to use foobar in a script like this: python foobar --args=someargs Question1: Can a package be made executable and used in a script like I described above? Question 2 The various modules will use code that I want to refactor into a common library. Does that mean creating a new sub directory 'foobar/common' and placing common.py in that folder? Question 3 How will the modules foo import the common module ? Is it 'from foobar import common' or can I not use this since these modules are part of the package? Question 4 I want to add logic for when the foobar package is being used in a script (assuming this can be done - I have only seen it done for modules) The code used is something like: if __name__ == "__main__": dosomething() where (in which file) would I put this logic ?

Read the article
python-iptables: Cryptic error when allowing incoming TCP traffic on port 1234

- by Lucas Kauffman

I wanted to write an iptables script in Python. Rather than calling iptables itself I wanted to use the python-iptables package. However I'm having a hard time getting some basic rules setup. I wanted to use the filter chain to accept incoming TCP traffic on port 1234. So I wrote this: import iptc chain = iptc.Chain(iptc.TABLE_FILTER,"INPUT") rule = iptc.Rule() target = iptc.Target(rule,"ACCEPT") match = iptc.Match(rule,'tcp') match.dport='1234' rule.add_match(match) rule.target = target chain.insert_rule(rule) However when I run this I get this thrown back at me: Traceback (most recent call last): File "testing.py", line 9, in <module> chain.insert_rule(rule) File "/usr/local/lib/python2.6/dist-packages/iptc/__init__.py", line 1133, in insert_rule self.table.insert_entry(self.name, rbuf, position) File "/usr/local/lib/python2.6/dist-packages/iptc/__init__.py", line 1166, in new obj.refresh() File "/usr/local/lib/python2.6/dist-packages/iptc/__init__.py", line 1230, in refresh self._free() File "/usr/local/lib/python2.6/dist-packages/iptc/__init__.py", line 1224, in _free self.commit() File "/usr/local/lib/python2.6/dist-packages/iptc/__init__.py", line 1219, in commit raise IPTCError("can't commit: %s" % (self.strerror())) iptc.IPTCError: can't commit: Invalid argument Exception AttributeError: "'NoneType' object has no attribute 'get_errno'" in <bound method Table.__del__ of <iptc.Table object at 0x7fcad56cc550>> ignored Does anyone have experience with python-iptables that could enlighten on what I did wrong?

Read the article
Django caching seems to be causing problems

- by Issy

Hey guys, i have just implemented the Django Cache Local Memory back end in some my code, however it seems to be causing a problem. I get the following error when trying to view the site (With Debug On): Traceback (most recent call last): File "/usr/lib/python2.6/dist-packages/django/core/servers/basehttp.py", line 279, in run self.result = application(self.environ, self.start_response) File "/usr/lib/python2.6/dist-packages/django/core/servers/basehttp.py", line 651, in __call__ return self.application(environ, start_response) File "/usr/lib/python2.6/dist-packages/django/core/handlers/wsgi.py", line 245, in __call__ response = middleware_method(request, response) File "/usr/lib/python2.6/dist-packages/django/middleware/cache.py", line 91, in process_response patch_response_headers(response, timeout) File "/usr/lib/python2.6/dist-packages/django/utils/cache.py", line 112, in patch_response_headers response['Expires'] = http_date(time.time() + cache_timeout) TypeError: unsupported operand type(s) for +: 'float' and 'str' I have checked my code, for caching everything seems to be ok. For example, i have the following in my middleware. MIDDLEWARE_CLASSES = ( 'django.contrib.sessions.middleware.SessionMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'django.middleware.cache.UpdateCacheMiddleware', 'django.middleware.common.CommonMiddleware', 'django.middleware.cache.FetchFromCacheMiddleware', 'django.contrib.flatpages.middleware.FlatpageFallbackMiddleware', ) My settings for Cache: CACHE_BACKEND = 'locmem://' CACHE_MIDDLEWARE_SECONDS = '3600' CACHE_MIDDLEWARE_KEY_PREFIX = 'za' CACHE_MIDDLEWARE_ANONYMOUS_ONLY = True And some of my code (template tag): def get_featured_images(): """ provides featured images """ cache_key = 'featured_images' images = cache.get(cache_key) if images is None: images = FeaturedImage.objects.all().filter(enabled=True)[:5] cache.set(cache_key, images) return {'images': images} Any idea what could be the problem, from the error message below it looks like there's an issue in django's cache.py?

Read the article
Cannot run Python script on Windows with output redirected??

- by Wai Yip Tung

This is running on Windows 7 (64 bit), Python 2.6 with Win32 Extensions for Python. I have a simple script that just print "hello world". I can launch it with python hello.py. In this case I can redirect the output to a file. But if I run it by just typing hello.py on the command line and redirect the output, I get an exception. C:> python hello.py hello world C:> python hello.py >output C:> type output hello world C:> hello.py hello world C:> hello.py >output close failed in file object destructor: Error in sys.excepthook: Original exception was: I think I first get this error after upgrading to Windows 7. I remember it should work in XP. I have seen people talking about this bug python-Bugs-1012692 | Can't pipe input to a python program. But that was long time ago. And it does not mention any solution. Have anyone experienced this? Anyone can help?

Read the article
Apply a Quartz filter while saving PDF under Mac OS X 10.6.3

- by olpa

Using Mac OS X API, I'm trying to save a PDF file with a Quartz filter applied, just like it is possible from the "Save As" dialog in the Preview application. So far I've written the following code (using Python and pyObjC, but it isn't important for me): -- filter-pdf.py: begin from Foundation import * from Quartz import * import objc page_rect = CGRectMake (0, 0, 612, 792) fdict = NSDictionary.dictionaryWithContentsOfFile_("/System/Library/Filters/Blue \ Tone.qfilter") in_pdf = CGPDFDocumentCreateWithProvider(CGDataProviderCreateWithFilename ("test .pdf")) url = CFURLCreateWithFileSystemPath(None, "test_out.pdf", kCFURLPOSIXPathStyle, False) c = CGPDFContextCreateWithURL(url, page_rect, fdict) np = CGPDFDocumentGetNumberOfPages(in_pdf) for ip in range (1, np+1): page = CGPDFDocumentGetPage(in_pdf, ip) r = CGPDFPageGetBoxRect(page, kCGPDFMediaBox) CGContextBeginPage(c, r) CGContextDrawPDFPage(c, page) CGContextEndPage(c) -- filter-pdf.py: end Unfortunalte, the filter "Blue Tone" isn't applied, the output PDF looks exactly as the input PDF. Question: what I missed? How to apply a filter? Well, the documentation doesn't promise that such way of creating and using "fdict" should cause that the filter is applied. But I just rewritten (as far as I can) sample code /Developer/Examples/Quartz/Python/filter-pdf.py, which was distributed with older versions of Mac (meanwhile, this code doesn't work too): ----- filter-pdf-old.py: begin from CoreGraphics import * import sys, os, math, getopt, string def usage (): print ''' usage: python filter-pdf.py FILTER INPUT-PDF OUTPUT-PDF Apply a ColorSync Filter to a PDF document. ''' def main (): page_rect = CGRectMake (0, 0, 612, 792) try: opts,args = getopt.getopt (sys.argv[1:], '', []) except getopt.GetoptError: usage () sys.exit (1) if len (args) != 3: usage () sys.exit (1) filter = CGContextFilterCreateDictionary (args[0]) if not filter: print 'Unable to create context filter' sys.exit (1) pdf = CGPDFDocumentCreateWithProvider (CGDataProviderCreateWithFilename (args[1])) if not pdf: print 'Unable to open input file' sys.exit (1) c = CGPDFContextCreateWithFilename (args[2], page_rect, filter) if not c: print 'Unable to create output context' sys.exit (1) for p in range (1, pdf.getNumberOfPages () + 1): #r = pdf.getMediaBox (p) r = pdf.getPage(p).getBoxRect(p) c.beginPage (r) c.drawPDFDocument (r, pdf, p) c.endPage () c.finish () if __name__ == '__main__': main () ----- filter-pdf-old.py: end

Read the article
How should I launch a Portable Python Tkinter application on Windows without ugliness?

- by Andrew

I've written a simple GUI program in python using Tkinter. Let's call this program 'gui.py'. My users run 'gui.py' on Windows machines from a USB key using Portable Python; installing anything on the host machine is undesirable. I'd like my users to run 'gui.py' by double-clicking an icon at the root of the USB key. My users don't care what python is, and they don't want to use a command prompt if they don't have to. I don't want them to have to care what drive letter the USB key is assigned. I'd like this to work on XP, Vista, and 7. My first ugly solution was to create a shortcut in the root directory of the USB key, and set the "Target" property of the shortcut to something like "(root)\App\pythonw.exe (root)\App\gui.py", but I couldn't figure out how to do a relative path in a windows shortcut, and using an absolute path like "E:" seems fragile. My next solution was to create a .bat script in the root directory of the USB key, something like this: @echo off set basepath=%~dp0 "%basepath%App\pythonw.exe" "%basepath%\App\gui.py" This doesn't seem to care what drive letter the USB key is assigned, but it does leave a DOS window open while my program runs. Functional, but ugly. Next I tried a .bat script like this: @echo off set basepath=%~dp0 start "" "%basepath%App\pythonw.exe" "%basepath%\App\gui.py" (See here for an explanation of the funny quoting) Now, the DOS window briefly flashes on screen before my GUI opens. Less ugly! Still ugly. How do real men deal with this problem? What's the least ugly way to start a python Tkinter GUI on a Windows machine from a USB stick?

Read the article
OpenMeetings + Python + Suds

- by user366774

Trying to integrate openmeetings with django website, but can't understand how properly configure ImportDoctor: (here :// replaced with __ 'cause spam protection) print url http://sovershenstvo.com.ua:5080/openmeetings/services/UserService?wsdl imp = Import('http__schemas.xmlsoap.org/soap/encoding/') imp.filter.add('http__services.axis.openmeetings.org') imp.filter.add('http__basic.beans.hibernate.app.openmeetings.org/xsd') imp.filter.add('http__basic.beans.data.app.openmeetings.org/xsd') imp.filter.add('http__services.axis.openmeetings.org') d = ImportDoctor(imp) client = Client(url, doctor = d) client.service.getSession() Traceback (most recent call last): File "", line 1, in File "/usr/lib/python2.6/site-packages/suds/client.py", line 539, in call return client.invoke(args, kwargs) File "/usr/lib/python2.6/site-packages/suds/client.py", line 598, in invoke result = self.send(msg) File "/usr/lib/python2.6/site-packages/suds/client.py", line 627, in send result = self.succeeded(binding, reply.message) File "/usr/lib/python2.6/site-packages/suds/client.py", line 659, in succeeded r, p = binding.get_reply(self.method, reply) File "/usr/lib/python2.6/site-packages/suds/bindings/binding.py", line 159, in get_reply resolved = rtypes[0].resolve(nobuiltin=True) File "/usr/lib/python2.6/site-packages/suds/xsd/sxbasic.py", line 63, in resolve raise TypeNotFound(qref) suds.TypeNotFound: Type not found: '(Sessiondata, http__basic.beans.hibernate.app.openmeetings.org/xsd, )' what i'm doing wrong? please help and sorry for my english, but you are my last chance to save position :( need webinars at morning (2.26 am now)

Read the article
How do I structure my tests with Python unittest module?

- by persepolis

I'm trying to build a test framework for automated webtesting in selenium and unittest, and I want to structure my tests into distinct scripts. So I've organised it as following: base.py - This will contain, for now, the base selenium test case class for setting up a session. import unittest from selenium import webdriver # Base Selenium Test class from which all test cases inherit. class BaseSeleniumTest(unittest.TestCase): def setUp(self): self.browser = webdriver.Firefox() def tearDown(self): self.browser.close() main.py - I want this to be the overall test suite from which all the individual tests are run. import unittest import test_example if __name__ == "__main__": SeTestSuite = test_example.TitleSpelling() unittest.TextTestRunner(verbosity=2).run(SeTestSuite) test_example.py - An example test case, it might be nice to make these run on their own too. from base import BaseSeleniumTest # Test the spelling of the title class TitleSpelling(BaseSeleniumTest): def test_a(self): self.assertTrue(False) def test_b(self): self.assertTrue(True) The problem is that when I run main.py I get the following error: Traceback (most recent call last): File "H:\Python\testframework\main.py", line 5, in <module> SeTestSuite = test_example.TitleSpelling() File "C:\Python27\lib\unittest\case.py", line 191, in __init__ (self.__class__, methodName)) ValueError: no such test method in <class 'test_example.TitleSpelling'>: runTest I suspect this is due to the very special way in which unittest runs and I must have missed a trick on how the docs expect me to structure my tests. Any pointers?

Read the article
how to solve a weired swig python c++ interfacing type error

- by user2981648

I want to use swig to switch a simple cpp function to python and use "scipy.integrate.quadrature" function to calculate the integration. But python 2.7 reports a type error. Do you guys know what is going on here? Thanks a lot. Furthermore, "scipy.integrate.quad" runs smoothly. So is there something special for "scipy.integrate.quadrature" function? The code is in the following: File "testfunctions.h": #ifndef TESTFUNCTIONS_H #define TESTFUNCTIONS_H double test_square(double x); #endif File "testfunctions.cpp": #include "testfunctions.h" double test_square(double x) { return x * x; } File "swig_test.i" : /* File : swig_test.i */ %module swig_test %{ #include "testfunctions.h" %} /* Let's just grab the original header file here */ %include "testfunctions.h" File "test.py": import scipy.integrate import _swig_test print scipy.integrate.quadrature(_swig_test.test_square, 0., 1.) error info: UMD has deleted: _swig_test Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 523, in runfile execfile(filename, namespace) File "D:\data\haitaliu\Desktop\Projects\swig_test\Release\test.py", line 4, in <module> print scipy.integrate.quadrature(_swig_test.test_square, 0., 1.) File "C:\Python27\lib\site-packages\scipy\integrate\quadrature.py", line 161, in quadrature newval = fixed_quad(vfunc, a, b, (), n)[0] File "C:\Python27\lib\site-packages\scipy\integrate\quadrature.py", line 61, in fixed_quad return (b-a)/2.0*sum(w*func(y,*args),0), None File "C:\Python27\lib\site-packages\scipy\integrate\quadrature.py", line 90, in vfunc return func(x, *args) TypeError: in method 'test_square', argument 1 of type 'double'

Read the article
Celery Received unregistered task of type (run example)

- by Echeg

I'm trying to run example from Celery documentation. I run: celeryd --loglevel=INFO /usr/local/lib/python2.7/dist-packages/celery/loaders/default.py:64: NotConfigured: No 'celeryconfig' module found! Please make sure it exists and is available to Python. "is available to Python." % (configname, ))) [2012-03-19 04:26:34,899: WARNING/MainProcess] -------------- celery@ubuntu v2.5.1 ---- **** ----- --- * *** * -- [Configuration] -- * - **** --- . broker: amqp://guest@localhost:5672// - ** ---------- . loader: celery.loaders.default.Loader - ** ---------- . logfile: [stderr]@INFO - ** ---------- . concurrency: 4 - ** ---------- . events: OFF - *** --- * --- . beat: OFF -- ******* ---- --- ***** ----- [Queues] -------------- . celery: exchange:celery (direct) binding:celery tasks.py: # -*- coding: utf-8 -*- from celery.task import task @task def add(x, y): return x + y run_task.py: # -*- coding: utf-8 -*- from tasks import add result = add.delay(4, 4) print (result) print (result.ready()) print (result.get()) In same folder celeryconfig.py: CELERY_IMPORTS = ("tasks", ) CELERY_RESULT_BACKEND = "amqp" BROKER_URL = "amqp://guest:guest@localhost:5672//" CELERY_TASK_RESULT_EXPIRES = 300 When I run "run_task.py": on python console eb503f77-b5fc-44e2-ac0b-91ce6ddbf153 False errors on celeryd server [2012-03-19 04:34:14,913: ERROR/MainProcess] Received unregistered task of type 'tasks.add'. The message has been ignored and discarded. Did you remember to import the module containing this task? Or maybe you are using relative imports? Please see http://bit.ly/gLye1c for more information. The full contents of the message body was: {'retries': 0, 'task': 'tasks.add', 'utc': False, 'args': (4, 4), 'expires': None, 'eta': None, 'kwargs': {}, 'id': '841bc21f-8124-436b-92f1-e3b62cafdfe7'} Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 444, in receive_message self.strategies[name](message, body, message.ack_log_error) KeyError: 'tasks.add' Please explain what's the problem.

Read the article
Why doesn't `stdin.read()` read entire buffer?

- by Shookie

I've got the following code: def get_input(self): """ Reads command from stdin, returns its JSON form """ json_string = sys.stdin.read() print("json string is: "+json_string) json_data =json.loads(json_string) return json_data It reads a json string that was sent to it from another process. The json is read from stdin. For some reason I get the following output: json string is: <Some json here> json string is: Traceback (most recent call last): File "/Users/Matan/Documents/workspace/ProjectSH/addonmanager/addon_manager.py", line 63, in <module> manager.accept_commands() File "/Users/Matan/Documents/workspace/ProjectSH/addonmanager/addon_manager.py", line 49, in accept_commands json_data = self.get_input() File "/Users/Matan/Documents/workspace/ProjectSH/addonmanager/addon_manager.py", line 42, in get_input json_data =json.loads(json_string) File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads return _default_decoder.decode(s) File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 383, in raw_decode raise ValueError("No JSON object could be decoded") So for some reason it reads an empty string from stdin instead of reading only the json. I've checked, and the code that writes to this process's stdin writes to it only once. What's wrong here?

Read the article
Are file access times not properly maintained in Mac OS X?

- by Ether

I'm trying to determine how file access times are maintained by default in Mac OS X, as I'm trying to diagnose some odd behaviour I'm seeing in a new MBP Unibody (running Snow Leopard, 10.6.2): The symptoms (drilling down to the specific behaviour that seems to be causing the issue): mutt is unable to switch to mailboxes which have recently received new mail mail is delivered by procmail, which updates the mtime of the mbox folder it is updating, but does not alter the atime (this is how new mail detection works: by comparing atime to mtime) however, both the mtime and atime of the mbox file is getting updated Through testing, it does not appear that atimes can be set separately in the filesystem: : [ether@tequila ~]$; touch test : [ether@tequila ~]$; touch -m -t 200801010000 test2 : [ether@tequila ~]$; touch -a -t 200801010000 test3 : [ether@tequila ~]$; ls -l test* -rw------- 1 ether staff 0 Dec 30 11:42 test -rw------- 1 ether staff 0 Jan 1 2008 test2 -rw------- 1 ether staff 0 Dec 30 11:43 test3 : [ether@tequila ~]$; ls -lu test* -rw------- 1 ether staff 0 Dec 30 11:42 test -rw------- 1 ether staff 0 Dec 30 11:43 test2 -rw------- 1 ether staff 0 Dec 30 11:43 test3 The test2 file is created with an old mtime, and the atime is set to now (as it is a new file), which is correct. However, test3 is created with an old atime, but is not set properly on the file. To be sure this is not just behaviour seen with new files, let's modify an old file: : [ether@tequila ~]$; touch -a -t 200801010000 test : [ether@tequila ~]$; ls -l test -rw------- 1 ether staff 0 Dec 30 11:42 test : [ether@tequila ~]$; ls -lu test -rw------- 1 ether staff 0 Dec 30 11:45 test So it would seem that atimes cannot be set explicitly (it is always reset to "now" when either mtime or atime modifications are submitted). Is this something inherent to the filesystem itself, is it something that can be changed, or am I totally crazy and looking in the wrong place? PS. the output of mount is: : [ether@tequila ~]$; mount /dev/disk0s2 on / (hfs, local, journaled) devfs on /dev (devfs, local, nobrowse) map -hosts on /net (autofs, nosuid, automounted, nobrowse) map auto_home on /home (autofs, automounted, nobrowse) ...and Disk Utility says that the drive is of type "Mac OS Extended (Journaled)".

Read the article
Not sure about ACL permissions

- by Darko Miletic

I'm writing up something about ACL usage on CentOS but since I still do not have a box ready I would like to ask something. Let us assume we have a folder /var/www/test If I do this in terms of permissions: /bin/chown -R root:root /var/www/test/ /bin/chmod -R u=rwx,go= /var/www/test/ /usr/bin/setfacl -R -m u:apache:rwx /var/www/test/ Will user apache be able to change owner of folder test or of any particular file within that folder? If answer is yes shall I than use group instead of user?

Read the article
Mysteriously empty $_POST array

- by Lex

Hi all! I have the following HTML/PHP page: <?php if(empty($_SERVER['CONTENT_TYPE'])) { $type = "application/x-www-form-urlencoded"; $_SERVER['CONTENT_TYPE'] = $type; } echo "<pre>"; var_dump($_POST); var_dump(file_get_contents("php://input")); echo "</pre>"; ?> <form method="post" action="test.php"> <input type="text" name="test[1]" /> <input type="text" name="test[2]" /> <input type="text" name="test[3]" /> <input type="submit" name="action" value="Go" /> </form> As you can see, the form will submit and the expected output is a POST array with one array in it containing the filled in values and one entry "action" with the value "Go" (the button). However, no matter what values I enter in the fields; the result is always: array(2) { ["test"]=> string(0) "" ["action"]=> string(2) "Go" } string(16) "test=&action=Go&" Somehow, the array named test is emptied, the "action" variable does make it through. I've used the Live HTTP Headers extension for Firefox to check whether the POST fields get submitted, and they do. The relevant information from Live HTTP Headers (with a, b and c filled in as values in the textboxes): Content-Type: application/x-www-form-urlencoded Content-Length: 51 test%5B1%5D=a&test%5B2%5D=b&test%5B3%5D=c&action=Go Does anybody have any idea as to why this is happening? I'm freaking out on this one, it has cost me so much time already... EDIT: We've tried this on different servers, on Windows boxes it does work, on the Ubuntu server with PHP version 5.2.4 (with Suhosin), it doesn't. It even works on a different server, also with Ubuntu and the same PHP version, also with Suhosin installed.

Read the article
How to time batch file execution using timethis.exe?

- by unknown

While timethis.exe works fine for almost every application, it seems to fail for .bat files: C:\test>timethis test.bat TimeThis : Command Line : test.bat TimeThis : Start Time : Fri Feb 26 19:46:30 2010 'test.bat' is not recognized as an internal or external command, operable program or batch file. TimeThis : Command Line : test.bat TimeThis : Start Time : Fri Feb 26 19:46:30 2010 TimeThis : End Time : Fri Feb 26 19:46:30 2010 TimeThis : Elapsed Time : 00:00:00.070 While executing it on a regular command line is fine, timethis.exe fails for it. How do I fix this problem?

Read the article
Guidance: A Branching strategy for Scrum Teams

- by Martin Hinshelwood

Having a good branching strategy will save your bacon, or at least your code. Be careful when deviating from your branching strategy because if you do, you may be worse off than when you started! This is one possible branching strategy for Scrum teams and I will not be going in depth with Scrum but you can find out more about Scrum by reading the Scrum Guide and you can even assess your Scrum knowledge by having a go at the Scrum Open Assessment. You can also read SSW’s Rules to Better Scrum using TFS which have been developed during our own Scrum implementations. Acknowledgements Bill Heys – Bill offered some good feedback on this post and helped soften the language. Note: Bill is a VS ALM Ranger and co-wrote the Branching Guidance for TFS 2010 Willy-Peter Schaub – Willy-Peter is an ex Visual Studio ALM MVP turned blue badge and has been involved in most of the guidance including the Branching Guidance for TFS 2010 Chris Birmele – Chris wrote some of the early TFS Branching and Merging Guidance. Dr Paul Neumeyer, Ph.D Parallel Processes, ScrumMaster and SSW Solution Architect – Paul wanted to have feature branches coming from the release branch as well. We agreed that this is really a spin-off that needs own project, backlog, budget and Team. Scenario: A product is developed RTM 1.0 is released and gets great sales. Extra features are demanded but the new version will have double to price to pay to recover costs, work is approved by the guys with budget and a few sprints later RTM 2.0 is released. Sales a very low due to the pricing strategy. There are lots of clients on RTM 1.0 calling out for patches. As I keep getting Reverse Integration and Forward Integration mixed up and Bill keeps slapping my wrists I thought I should have a reminder: You still seemed to use reverse and/or forward integration in the wrong context. I would recommend reviewing your document at the end to ensure that it agrees with the common understanding of these terms merge (forward integration) from parent to child (same direction as the branch), and merge (reverse integration) from child to parent (the reverse direction of the branch). - one of my many slaps on the wrist from Bill Heys. As I mentioned previously we are using a single feature branching strategy in our current project. The single biggest mistake developers make is developing against the “Main” or “Trunk” line. This ultimately leads to messy code as things are added and never finished. Your only alternative is to NEVER check in unless your code is 100%, but this does not work in practice, even with a single developer. Your ADD will kick in and your half-finished code will be finished enough to pass the build and the tests. You do use builds don’t you? Sadly, this is a very common scenario and I have had people argue that branching merely adds complexity. Then again I have seen the other side of the universe ... branching structures from he... We should somehow convince everyone that there is a happy between no-branching and too-much-branching. - Willy-Peter Schaub, VS ALM Ranger, Microsoft A key benefit of branching for development is to isolate changes from the stable Main branch. Branching adds sanity more than it adds complexity. We do try to stress in our guidance that it is important to justify a branch, by doing a cost benefit analysis. The primary cost is the effort to do merges and resolve conflicts. A key benefit is that you have a stable code base in Main and accept changes into Main only after they pass quality gates, etc. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft The second biggest mistake developers make is branching anything other than the WHOLE “Main” line. If you branch parts of your code and not others it gets out of sync and can make integration a nightmare. You should have your Source, Assets, Build scripts deployment scripts and dependencies inside the “Main” folder and branch the whole thing. Some departments within MSFT even go as far as to add the environments used to develop the product in there as well; although I would not recommend that unless you have a massive SQL cluster to house your source code. We tried the “add environment” back in South-Africa and while it was “phenomenal”, especially when having to switch between environments, the disk storage and processing requirements killed us. We opted for virtualization to skin this cat of keeping a ready-to-go environment handy. - Willy-Peter Schaub, VS ALM Ranger, Microsoft I think people often think that you should have separate branches for separate environments (e.g. Dev, Test, Integration Test, QA, etc.). I prefer to think of deploying to environments (such as from Main to QA) rather than branching for QA). - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft You can read about SSW’s Rules to better Source Control for some additional information on what Source Control to use and how to use it. There are also a number of branching Anti-Patterns that should be avoided at all costs: You know you are on the wrong track if you experience one or more of the following symptoms in your development environment: Merge Paranoia—avoiding merging at all cost, usually because of a fear of the consequences. Merge Mania—spending too much time merging software assets instead of developing them. Big Bang Merge—deferring branch merging to the end of the development effort and attempting to merge all branches simultaneously. Never-Ending Merge—continuous merging activity because there is always more to merge. Wrong-Way Merge—merging a software asset version with an earlier version. Branch Mania—creating many branches for no apparent reason. Cascading Branches—branching but never merging back to the main line. Mysterious Branches—branching for no apparent reason. Temporary Branches—branching for changing reasons, so the branch becomes a permanent temporary workspace. Volatile Branches—branching with unstable software assets shared by other branches or merged into another branch. Note Branches are volatile most of the time while they exist as independent branches. That is the point of having them. The difference is that you should not share or merge branches while they are in an unstable state. Development Freeze—stopping all development activities while branching, merging, and building new base lines. Berlin Wall—using branches to divide the development team members, instead of dividing the work they are performing. -Branching and Merging Primer by Chris Birmele - Developer Tools Technical Specialist at Microsoft Pty Ltd in Australia In fact, this can result in a merge exercise no-one wants to be involved in, merging hundreds of thousands of change sets and trying to get a consolidated build. Again, we need to find a happy medium. - Willy-Peter Schaub on Merge Paranoia Merge conflicts are generally the result of making changes to the same file in both the target and source branch. If you create merge conflicts, you will eventually need to resolve them. Often the resolution is manual. Merging more frequently allows you to resolve these conflicts close to when they happen, making the resolution clearer. Waiting weeks or months to resolve them, the Big Bang approach, means you are more likely to resolve conflicts incorrectly. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft Figure: Main line, this is where your stable code lives and where any build has known entities, always passes and has a happy test that passes as well? Many development projects consist of, a single “Main” line of source and artifacts. This is good; at least there is source control . There are however a couple of issues that need to be considered. What happens if: you and your team are working on a new set of features and the customer wants a change to his current version? you are working on two features and the customer decides to abandon one of them? you have two teams working on different feature sets and their changes start interfering with each other? I just use labels instead of branches? That's a lot of “what if’s”, but there is a simple way of preventing this. Branching… In TFS, labels are not immutable. This does not mean they are not useful. But labels do not provide a very good development isolation mechanism. Branching allows separate code sets to evolve separately (e.g. Current with hotfixes, and vNext with new development). I don’t see how labels work here. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft Figure: Creating a single feature branch means you can isolate the development work on that branch. Its standard practice for large projects with lots of developers to use Feature branching and you can check the Branching Guidance for the latest recommendations from the Visual Studio ALM Rangers for other methods. In the diagram above you can see my recommendation for branching when using Scrum development with TFS 2010. It consists of a single Sprint branch to contain all the changes for the current sprint. The main branch has the permissions changes so contributors to the project can only Branch and Merge with “Main”. This will prevent accidental check-ins or checkouts of the “Main” line that would contaminate the code. The developers continue to develop on sprint one until the completion of the sprint. Note: In the real world, starting a new Greenfield project, this process starts at Sprint 2 as at the start of Sprint 1 you would have artifacts in version control and no need for isolation. Figure: Once the sprint is complete the Sprint 1 code can then be merged back into the Main line. There are always good practices to follow, and one is to always do a Forward Integration from Main into Sprint 1 before you do a Reverse Integration from Sprint 1 back into Main. In this case it may seem superfluous, but this builds good muscle memory into your developer’s work ethic and means that no bad habits are learned that would interfere with additional Scrum Teams being added to the Product. The process of completing your sprint development: The Team completes their work according to their definition of done. Merge from “Main” into “Sprint1” (Forward Integration) Stabilize your code with any changes coming from other Scrum Teams working on the same product. If you have one Scrum Team this should be quick, but there may have been bug fixes in the Release branches. (we will talk about release branches later) Merge from “Sprint1” into “Main” to commit your changes. (Reverse Integration) Check-in Delete the Sprint1 branch Note: The Sprint 1 branch is no longer required as its useful life has been concluded. Check-in Done But you are not yet done with the Sprint. The goal in Scrum is to have a “potentially shippable product” at the end of every Sprint, and we do not have that yet, we only have finished code. Figure: With Sprint 1 merged you can create a Release branch and run your final packaging and testing In 99% of all projects I have been involved in or watched, a “shippable product” only happens towards the end of the overall lifecycle, especially when sprints are short. The in-between releases are great demonstration releases, but not shippable. Perhaps it comes from my 80’s brain washing that we only ship when we reach the agreed quality and business feature bar. - Willy-Peter Schaub, VS ALM Ranger, Microsoft Although you should have been testing and packaging your code all the way through your Sprint 1 development, preferably using an automated process, you still need to test and package with stable unchanging code. This is where you do what at SSW we call a “Test Please”. This is first an internal test of the product to make sure it meets the needs of the customer and you generally use a resource external to your Team. Then a “Test Please” is conducted with the Product Owner to make sure he is happy with the output. You can read about how to conduct a Test Please on our Rules to Successful Projects: Do you conduct an internal "test please" prior to releasing a version to a client? Figure: If you find a deviation from the expected result you fix it on the Release branch. If during your final testing or your “Test Please” you find there are issues or bugs then you should fix them on the release branch. If you can’t fix them within the time box of your Sprint, then you will need to create a Bug and put it onto the backlog for prioritization by the Product owner. Make sure you leave plenty of time between your merge from the development branch to find and fix any problems that are uncovered. This process is commonly called Stabilization and should always be conducted once you have completed all of your User Stories and integrated all of your branches. Even once you have stabilized and released, you should not delete the release branch as you would with the Sprint branch. It has a usefulness for servicing that may extend well beyond the limited life you expect of it. Note: Don't get forced by the business into adding features into a Release branch instead that indicates the unspoken requirement is that they are asking for a product spin-off. In this case you can create a new Team Project and branch from the required Release branch to create a new Main branch for that product. And you create a whole new backlog to work from. Figure: When the Team decides it is happy with the product you can create a RTM branch. Once you have fixed all the bugs you can, and added any you can’t to the Product Backlog, and you Team is happy with the result you can create a Release. This would consist of doing the final Build and Packaging it up ready for your Sprint Review meeting. You would then create a read-only branch that represents the code you “shipped”. This is really an Audit trail branch that is optional, but is good practice. You could use a Label, but Labels are not Auditable and if a dispute was raised by the customer you can produce a verifiable version of the source code for an independent party to check. Rare I know, but you do not want to be at the wrong end of a legal battle. Like the Release branch the RTM branch should never be deleted, or only deleted according to your companies legal policy, which in the UK is usually 7 years. Figure: If you have made any changes in the Release you will need to merge back up to Main in order to finalise the changes. Nothing is really ever done until it is in Main. The same rules apply when merging any fixes in the Release branch back into Main and you should do a reverse merge before a forward merge, again for the muscle memory more than necessity at this stage. Your Sprint is now nearly complete, and you can have a Sprint Review meeting knowing that you have made every effort and taken every precaution to protect your customer’s investment. Note: In order to really achieve protection for both you and your client you would add Automated Builds, Automated Tests, Automated Acceptance tests, Acceptance test tracking, Unit Tests, Load tests, Web test and all the other good engineering practices that help produce reliable software. Figure: After the Sprint Planning meeting the process begins again. Where the Sprint Review and Retrospective meetings mark the end of the Sprint, the Sprint Planning meeting marks the beginning. After you have completed your Sprint Planning and you know what you are trying to achieve in Sprint 2 you can create your new Branch to develop in. How do we handle a bug(s) in production that can’t wait? Although in Scrum the only work done should be on the backlog there should be a little buffer added to the Sprint Planning for contingencies. One of these contingencies is a bug in the current release that can’t wait for the Sprint to finish. But how do you handle that? Willy-Peter Schaub asked an excellent question on the release activities: In reality Sprint 2 starts when sprint 1 ends + weekend. Should we not cater for a possible parallelism between Sprint 2 and the release activities of sprint 1? It would introduce FI’s from main to sprint 2, I guess. Your “Figure: Merging print 2 back into Main.” covers, what I tend to believe to be reality in most cases. - Willy-Peter Schaub, VS ALM Ranger, Microsoft I agree, and if you have a single Scrum team then your resources are limited. The Scrum Team is responsible for packaging and release, so at least one run at stabilization, package and release should be included in the Sprint time box. If more are needed on the current production release during the Sprint 2 time box then resource needs to be pulled from Sprint 2. The Product Owner and the Team have four choices (in order of disruption/cost): Backlog: Add the bug to the backlog and fix it in the next Sprint Buffer Time: Use any buffer time included in the current Sprint to fix the bug quickly Make time: Remove a Story from the current Sprint that is of equal value to the time lost fixing the bug(s) and releasing. Note: The Team must agree that it can still meet the Sprint Goal. Cancel Sprint: Cancel the sprint and concentrate all resource on fixing the bug(s) Note: This can be a very costly if the current sprint has already had a lot of work completed as it will be lost. The choice will depend on the complexity and severity of the bug(s) and both the Product Owner and the Team need to agree. In this case we will go with option #2 or #3 as they are uncomplicated but severe bugs. Figure: Real world issue where a bug needs fixed in the current release. If the bug(s) is urgent enough then then your only option is to fix it in place. You can edit the release branch to find and fix the bug, hopefully creating a test so it can’t happen again. Follow the prior process and conduct an internal and customer “Test Please” before releasing. You can read about how to conduct a Test Please on our Rules to Successful Projects: Do you conduct an internal "test please" prior to releasing a version to a client? Figure: After you have fixed the bug you need to ship again. You then need to again create an RTM branch to hold the version of the code you released in escrow. Figure: Main is now out of sync with your Release. We now need to get these new changes back up into the Main branch. Do a reverse and then forward merge again to get the new code into Main. But what about the branch, are developers not working on Sprint 2? Does Sprint 2 now have changes that are not in Main and Main now have changes that are not in Sprint 2? Well, yes… and this is part of the hit you take doing branching. But would this scenario even have been possible without branching? Figure: Getting the changes in Main into Sprint 2 is very important. The Team now needs to do a Forward Integration merge into their Sprint and resolve any conflicts that occur. Maybe the bug has already been fixed in Sprint 2, maybe the bug no longer exists! This needs to be identified and resolved by the developers before they continue to get further out of Sync with Main. Note: Avoid the “Big bang merge” at all costs. Figure: Merging Sprint 2 back into Main, the Forward Integration, and R0 terminates. Sprint 2 now merges (Reverse Integration) back into Main following the procedures we have already established. Figure: The logical conclusion. This then allows the creation of the next release. By now you should be getting the big picture and hopefully you learned something useful from this post. I know I have enjoyed writing it as I find these exploratory posts coupled with real world experience really help harden my understanding. Branching is a tool; it is not a silver bullet. Don’t over use it, and avoid “Anti-Patterns” where possible. Although the diagram above looks complicated I hope showing you how it is formed simplifies it as much as possible. Technorati Tags: Branching,Scrum,VS ALM,TFS 2010,VS2010

Read the article
error while running ruby application at system startup in ubuntu

- by anjo

I am on Ubuntu 12.04 machine. Have a script file which runs when entered manually in terminal gnome-terminal -e /home/precise/Desktop/cartodb/script.sh The content of script file is cd /home/ubuntupc/Desktop/cartodb20/ sh /home/ubuntupc/.rvm/scripts/rvm bundle exec foreman start -p 3000 So what i tried to do is to run this script at every system start up. So on Startup Applications command: gnome-terminal -e /home/precise/Desktop/cartodb/script.sh On terminal Edit - Profile Preferences - Title and Command Checked the "Run command as a login shell" But this seems to be not working. When restarted the machine found these error in terminal The child process exited normally with status 127. ERROR: RVM Ruby not used, run `rvm use ruby` first. Some info regarding the installed packages and system. $ which ruby /home/ubuntupc/.rvm/rubies/ruby-1.9.2-p320/bin/ruby $ which rails /home/ubuntupc/.rvm/gems/ruby-1.9.2-p320/bin/rails $ which gem /home/ubuntupc/.rvm/rubies/ruby-1.9.2-p320/bin/gem $ cat ~/.bash_profile [[ -s "$HOME/.profile" ]] && source "$HOME/.profile" # Load the default .profile [[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm" # Load RVM into a shell session *as a function* $ which -a ruby /home/ubuntupc/.rvm/rubies/ruby-1.9.2-p320/bin/ruby $ sudo update-alternatives --config ruby update-alternatives: error: no alternatives for ruby. $ sudo find / -name "rubygems" -print /home/ubuntupc/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/site_ruby/1.9.1/rubygems /home/ubuntupc/.rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/rubygems /home/ubuntupc/.rvm/src/ruby-1.9.2-p320/lib/rubygems /home/ubuntupc/.rvm/src/ruby-1.9.2-p320/test/rubygems /home/ubuntupc/.rvm/src/ruby-1.9.2-p320/test/rubygems/rubygems /home/ubuntupc/.rvm/src/ruby-1.9.2-p320/doc/rubygems /home/ubuntupc/.rvm/src/rubygems-2.2.1/lib/rubygems /home/ubuntupc/.rvm/src/rubygems-2.2.1/test/rubygems /home/ubuntupc/.rvm/src/rubygems-2.2.1/test/rubygems/rubygems /home/ubuntupc/.rvm/src/rvm/scripts/functions/rubygems /home/ubuntupc/.rvm/src/rvm/scripts/rubygems /home/ubuntupc/.rvm/scripts/functions/rubygems /home/ubuntupc/.rvm/scripts/rubygems /usr/lib/ruby/1.9.1/rubygems /usr/local/rvm/rubies/ruby-1.9.2-p320/lib/ruby/site_ruby/1.9.1/rubygems /usr/local/rvm/rubies/ruby-1.9.2-p320/lib/ruby/1.9.1/rubygems /usr/local/rvm/src/ruby-1.9.2-p320/lib/rubygems /usr/local/rvm/src/ruby-1.9.2-p320/test/rubygems /usr/local/rvm/src/ruby-1.9.2-p320/test/rubygems/rubygems /usr/local/rvm/src/ruby-1.9.2-p320/doc/rubygems /usr/local/rvm/src/rubygems-2.2.0/lib/rubygems /usr/local/rvm/src/rubygems-2.2.0/test/rubygems /usr/local/rvm/src/rubygems-2.2.0/test/rubygems/rubygems /usr/local/rvm/src/rvm/scripts/functions/rubygems /usr/local/rvm/src/rvm/scripts/rubygems /usr/local/rvm/scripts/functions/rubygems /usr/local/rvm/scripts/rubygems Please point out what i am missing as i am new to the ruby applications. Thanks in advance

Read the article
Advanced TSQL Tuning: Why Internals Knowledge Matters

- by Paul White

There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes. Query tuning is not complete as soon as the query returns results quickly in the development or test environments. In production, your query will compete for memory, CPU, locks, I/O and other resources on the server. Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL. As always, we’ll need some example data. In fact, we are going to use three tables today, each of which is structured like this: Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row. The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type. A script to create a database with the three tables and load the sample data follows: USE master; GO IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest; GO CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN; GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB ); GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB ); GO ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE SortTest SET AUTO_CLOSE OFF ; ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_SHRINK OFF ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ; ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ; ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ; ALTER DATABASE SortTest SET MULTI_USER ; ALTER DATABASE SortTest SET RECOVERY SIMPLE ; USE SortTest; GO CREATE TABLE dbo.TestCHAR ( id INTEGER IDENTITY (1,1) NOT NULL, padding CHAR(3999) NOT NULL, CONSTRAINT [PK dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestMAX ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL, CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestTEXT ( id INTEGER IDENTITY (1,1) NOT NULL, padding TEXT NOT NULL, CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id), ) ; -- ============= -- Load TestCHAR (about 3s) -- ============= INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding ) SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999) FROM ( SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) AS Data ORDER BY Data.n ASC ; -- ============ -- Load TestMAX (about 3s) -- ============ INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding ) SELECT CONVERT(VARCHAR(MAX), padding) FROM dbo.TestCHAR ORDER BY id ; -- ============= -- Load TestTEXT (about 5s) -- ============= INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding ) SELECT CONVERT(TEXT, padding) FROM dbo.TestCHAR ORDER BY id ; -- ========== -- Space used -- ========== -- EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT'; ; CHECKPOINT ; That takes around 15 seconds to run, and shows the space allocated to each table in its output: To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table. The basic shape of the test query is the same for each of the three test tables: SELECT TOP (150) T.id, T.padding FROM dbo.Test AS T ORDER BY NEWID() OPTION (MAXDOP 1) ; Test 1 – CHAR(3999) Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results. This seems slow, considering that the table only has 50,000 rows. Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time. Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially. Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring. The script below monitors STATISTICS IO output and the amount of tempdb used by the test query. We will also run a Profiler trace to capture any warnings generated during query execution. DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TC.id, TC.padding FROM dbo.TestCHAR AS TC ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; Let’s take a closer look at the statistics and query plan generated from this: Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB. The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row. With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB. Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first). This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts. In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted. SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed. Sorts typically require a very large number of comparisons, so this is usually a very effective optimization. One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself. SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use. The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory. Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant. The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources. More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1). Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB. The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file. Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case. In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms. If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much. Why is the data size so much smaller? The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored. By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output. You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead. SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows. The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes. The default behaviour of the TEXT type is to be stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page. There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option. By enabling this option for a table, MAX data will be stored off-row (in a LOB structure) instead of in-row. SQL Server Books Online has good coverage of both options in the topic In Row Data. The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage. You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now. This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL, CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; TEXT Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query). SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant. No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers. Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’. Why? Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed. SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages. This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this? Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need. 245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory. Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily? That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column. That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal. I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator. Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted. We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need. The following script implements that idea for all four tables: SET STATISTICS IO ON ; WITH TestTable AS ( SELECT * FROM dbo.TestCHAR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id = ANY (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAX ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestTEXT ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAXOOR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb. The small remaining inefficiency is in reading the id column values from the clustered primary key index. As a clustered index, it contains all the in-row data at its leaf. The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead. The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure. This difference is reflected in the number of logical page reads performed by the four queries: Table 'TestCHAR' logical reads 25511 lob logical reads 000 Table 'TestMAX'. logical reads 25511 lob logical reads 000 Table 'TestTEXT' logical reads 00412 lob logical reads 597 Table 'TestMAXOOR' logical reads 00413 lob logical reads 446 We can increase the density of the id values by creating a separate nonclustered index on the id column only. This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data. CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id); The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data. The logical reads with the new indexes in place are: Table 'TestCHAR' logical reads 835 lob logical reads 0 Table 'TestMAX' logical reads 835 lob logical reads 0 Table 'TestTEXT' logical reads 686 lob logical reads 597 Table 'TestMAXOOR' logical reads 686 lob logical reads 448 With the new index, all four queries use the same query plan (click to enlarge): Performance Summary: 0.3 seconds elapsed time 6MB memory grant 0MB tempdb usage 1MB sort set 835 logical reads (CHAR, MAX) 686 logical reads (TEXT, MAXOOR) 597 LOB logical reads (TEXT) 448 LOB logical reads (MAXOOR) No sort warning I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea Conclusion This post is not about tuning queries that access columns containing big strings. It isn’t about the internal differences between TEXT and MAX data types either. It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load. No, this post is about something else: Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance. Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows. 5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is! If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb. Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either. The point of this post is that a basic understanding of execution plans is not enough. Tuning for logical reads and adding covering indexes is not enough. If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on. The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances. By the way, the examples in this post were written for SQL Server 2008. They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator. Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch… This post is dedicated to the people of Christchurch, New Zealand. © 2011 Paul White email: @[email protected] twitter: @SQL_Kiwi

Read the article
SQL – Migrate Database from SQL Server to NuoDB – A Quick Tutorial

- by Pinal Dave

Data is growing exponentially and every organization with growing data is thinking of next big innovation in the world of Big Data. Big data is a indeed a future for every organization at one point of the time. Just like every other next big thing, big data has its own challenges and issues. The biggest challenge associated with the big data is to find the ideal platform which supports the scalability and growth of the data. If you are a regular reader of this blog, you must be familiar with NuoDB. I have been working with NuoDB for a while and their recent release is the best thus far. NuoDB is an elastically scalable SQL database that can run on local host, datacenter and cloud-based resources. A key feature of the product is that it does not require sharding (read more here). Last week, I was able to install NuoDB in less than 90 seconds and have explored their Explorer and Admin sections. You can read about my experiences in these posts: SQL – Step by Step Guide to Download and Install NuoDB – Getting Started with NuoDB SQL – Quick Start with Admin Sections of NuoDB – Manage NuoDB Database SQL – Quick Start with Explorer Sections of NuoDB – Query NuoDB Database Many SQL Authority readers have been following me in my journey to evaluate NuoDB. One of the frequently asked questions I’ve received from you is if there is any way to migrate data from SQL Server to NuoDB. The fact is that there is indeed a way to do so and NuoDB provides a fantastic tool which can help users to do it. NuoDB Migrator is a command line utility that supports the migration of Microsoft SQL Server, MySQL, Oracle, and PostgreSQL schemas and data to NuoDB. The migration to NuoDB is a three-step process: NuoDB Migrator generates a schema for a target NuoDB database It loads data into the target NuoDB database It dumps data from the source database Let’s see how we can migrate our data from SQL Server to NuoDB using a simple three-step approach. But before we do that we will create a sample database in MSSQL and later we will migrate the same database to NuoDB: Setup Step 1: Build a sample data CREATE DATABASE [Test]; CREATE TABLE [Department]( [DepartmentID] [smallint] NOT NULL, [Name] VARCHAR(100) NOT NULL, [GroupName] VARCHAR(100) NOT NULL, [ModifiedDate] [datetime] NOT NULL, CONSTRAINT [PK_Department_DepartmentID] PRIMARY KEY CLUSTERED ( [DepartmentID] ASC ) ) ON [PRIMARY]; INSERT INTO Department SELECT * FROM AdventureWorks2012.HumanResources.Department; Note that I am using the SQL Server AdventureWorks database to build this sample table but you can build this sample table any way you prefer. Setup Step 2: Install Java 64 bit Before you can begin the migration process to NuoDB, make sure you have 64-bit Java installed on your computer. This is due to the fact that the NuoDB Migrator tool is built in Java. You can download 64-bit Java for Windows, Mac OSX, or Linux from the following link: http://java.com/en/download/manual.jsp. One more thing to remember is that you make sure that the path in your environment settings is set to your JAVA_HOME directory or else the tool will not work. Here is how you can do it: Go to My Computer >> Right Click >> Select Properties >> Click on Advanced System Settings >> Click on Environment Variables >> Click on New and enter the following values. Variable Name: JAVA_HOME Variable Value: C:\Program Files\Java\jre7 Make sure you enter your Java installation directory in the Variable Value field. Setup Step 3: Install JDBC driver for SQL Server. There are two JDBC drivers available for SQL Server. Select the one you prefer to use by following one of the two links below: Microsoft JDBC Driver jTDS JDBC Driver In this example we will be using jTDS JDBC driver. Once you download the driver, move the driver to your NuoDB installation folder. In my case, I have moved the JAR file of the driver into the C:\Program Files\NuoDB\tools\migrator\jar folder as this is my NuoDB installation directory. Now we are all set to start the three-step migration process from SQL Server to NuoDB: Migration Step 1: NuoDB Schema Generation Here is the command I use to generate a schema of my SQL Server Database in NuoDB. First I go to the folder C:\Program Files\NuoDB\tools\migrator\bin and execute the nuodb-migrator.bat file. Note that my database name is ‘test’. Additionally my username and password is also ‘test’. You can see that my SQL Server database is running on my localhost on port 1433. Additionally, the schema of the table is ‘dbo’. nuodb-migrator schema –source.driver=net.sourceforge.jtds.jdbc.Driver –source.url=jdbc:jtds:sqlserver://localhost:1433/ –source.username=test –source.password=test –source.catalog=test –source.schema=dbo –output.path=/tmp/schema.sql The above script will generate a schema of all my SQL Server tables and will put it in the folder C:\tmp\schema.sql . You can open the schema.sql file and execute this file directly in your NuoDB instance. You can follow the link here to see how you can execute the SQL script in NuoDB. Please note that if you have not yet created the schema in the NuoDB database, you should create it before executing this step. Step 2: Generate the Dump File of the Data Once you have recreated your schema in NuoDB from SQL Server, the next step is very easy. Here we create a CSV format dump file, which will contain all the data from all the tables from the SQL Server database. The command to do so is very similar to the above command. Be aware that this step may take a bit of time based on your database size. nuodb-migrator dump –source.driver=net.sourceforge.jtds.jdbc.Driver –source.url=jdbc:jtds:sqlserver://localhost:1433/ –source.username=test –source.password=test –source.catalog=test –source.schema=dbo –output.type=csv –output.path=/tmp/dump.cat Once the above command is successfully executed you can find your CSV file in the C:\tmp\ folder. However, you do not have to do anything manually. The third and final step will take care of completing the migration process. Migration Step 3: Load the Data into NuoDB After building schema and taking a dump of the data, the very next step is essential and crucial. It will take the CSV file and load it into the NuoDB database. nuodb-migrator load –target.url=jdbc:com.nuodb://localhost:48004/mytest –target.schema=dbo –target.username=test –target.password=test –input.path=/tmp/dump.cat Please note that in the above script we are now targeting the NuoDB database, which we have already created with the name of “MyTest”. If the database does not exist, create it manually before executing the above script. I have kept the username and password as “test”, but please make sure that you create a more secure password for your database for security reasons. Voila! You’re Done That’s it. You are done. It took 3 setup and 3 migration steps to migrate your SQL Server database to NuoDB. You can now start exploring the database and build excellent, scale-out applications. In this blog post, I have done my best to come up with simple and easy process, which you can follow to migrate your app from SQL Server to NuoDB. Download NuoDB I strongly encourage you to download NuoDB and go through my 3-step migration tutorial from SQL Server to NuoDB. Additionally here are two very important blog post from NuoDB CTO Seth Proctor. He has written excellent blog posts on the concept of the Administrative Domains. NuoDB has this concept of an Administrative Domain, which is a collection of hosts that can run one or multiple databases. Each database has its own TEs and SMs, but all are managed within the Admin Console for that particular domain. http://www.nuodb.com/techblog/2013/03/11/getting-started-provisioning-a-domain/ http://www.nuodb.com/techblog/2013/03/14/getting-started-running-a-database/ Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

Read the article
What’s new in VS.10 & TFS.10?

- by johndoucette

Getting my geek on… I have decided to call the products VS.10 (Visual Studio 2010), TP.10 (Test Professional 2010), and TFS.10 (Team Foundation Server 2010) Thanks Neno Loje. What's new in Visual Studio & Team Foundation Server 2010? Focusing on Visual Studio Team System (VSTS) ALM-related parts: Visual Studio Ultimate 2010 NEW: IntelliTrace® (aka the historical debugger) NEW: Architecture Tools New Project Type: Modeling Project UML Diagrams UML Use Case Diagram UML Class Diagram UML Sequence Diagram (supports reverse enginneering) UML Activity Diagram UML Component Diagram Layer Diagram (with Team Build integration for layer validation) Architecuture Explorer Dependency visualization DGML Web & Load Tests Visual Studio Premium 2010 NEW: Architecture Tools Read-only model viewer Development Tools Code Analysis New Rules like SQL Injection detection Rule Sets Code Profiler Multi-Tier Profiling JScript Profiling Profiling applications on virtual machines in sampling mode Code Metrics Test Tools Code Coverage NEW: Test Impact Analysis NEW: Coded UI Test Database Tools (DB schema versioning & deployment) Visual Studio Professional 2010 Debuger Mixed Mode Debugging for 64-bit Applications Export/Import of Breakpoints and data tips Visual Studio Test Professional 2010 Microsoft Test Manager (MTM, formerly known as "Camano")) Fast Forward Testing Visual Studio Team Foundation Server 2010 Work Item Tracking and Project Management New MSF templatesfor Agile and CMMI (V 5.0) Hierarchical Work Items Custom Work Item Link Types Ready to use Excel agile project management workbooks for managing your backlogs (including capacity planing) Convert Work Item query to an Excel report MS Excel integration Support for Work Item hierarchies Formatting is preserved after doing a 'Refresh' MS Project integration Hierarchy and successor/predecessor info is now synchronized NEW: Test Case Management Version Control Public Workspaces Branch & Merge Visualization Tracking of Changesets & Work Items Gated Check-In Team Build Build Controllers and Agents Workflow 4-based build process NEW: Lab Management (only a pre-release is avaiable at the moment!) Project Portal & Reporting Dashboards (on SharePoint Portal) Burndown Chart TFS Web Parts (to show data from TFS) Administration & Operations Topology enhancements Application tier network load balancing (NLB) SQL Server scale out Improved Sharepoint flexibility Report Server flexibility Zone support Kerberos support Separation of TFS and SQL administration Setup Separate install from configure Improved installation wizards Optional components Simplified account requirements Improved Reporting Services configuration Setup consolidation Upgrading from previous TFS versions Improved IIS flexibility Administration Consolidation of command line tools User rename support Project Collections Archive/restore individual project collections Move Team Project Collections Server consolidation Team Project Collection Split Team Project Collection Isolation Server request cancellation Licensing: TFS server license included in MSDN subscriptions Removed features (former features not part of Visual Studio 2010): Debug » Start With Application Verifier Object Test Bench IntelliSense for C++ / CLI Debugging support for SQL 2000

Read the article
How do I make code bound to an ORM testable?

- by RPK

In Test Driven Development, how do I make code bound to an ORM testable? I am using a Micro-ORM (PetaPoco) and I have several methods that interact with the database like: AddCustomer UpdateRecord etc. I want to know how to write a test for these methods. I searched YouTube for videos on writing a test for DAL, but I didn't find any. I want to know which method or class is testable and how to write a test before writing the code itself.

Read the article
What type of code is suitable for unit testing?

- by RPK

In Test Driven Development, what type of code is testable? I am using a Micro-ORM (PetaPoco) and I have several methods that interact with the database like: AddCustomer UpdateRecord etc. I want to know how to write a test for these methods. I searched YouTube for videos on writing a test for DAL, but I didn't find any. I want to know which method or class is testable and how to write a test before writing the code itself.

Read the article

< Previous Page | 111 112 113 114 115 116 117 118 119 120 121 122 | Next Page >