Search Results

Search found 2372 results on 95 pages for 'significant whitespace'.

Page 20/95 | < Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >

Backlinks For Your Website

One of the most significant factors to a successful website is incoming links or backlinks. The more backlinks you have, the more chances that you will attain an increase page rank value.

Read the article
Apache log rotation: logrotate vs rotatelogs vs chronolog

- by Enrico

I have been researching log rotation for my server which hosts ~5 fairly high traffic sites. From what I can tell, my options are to use logrotate or to use piped logging with either rotatelogs or chronolog. logrotate requires a restart of apache and both SIGHUP and SIGUSR1 restarts are less than ideal on high traffic sites, because either you drop a bunch of connections or you need to delay compressing the old log until all child processes have died naturally. Also, downtime can be quite significant if compression is enabled. Would using logrotate - without compression and with graceful restart - and compressing old logs after the fact be the best way to minimize downtime? chronolog and rotatelogs sound promising, but are not well documented. I couldn't find examples of using either in combination with vhost specific logs. The chronolog website says, "when the expanded filename changes, the current file is closed and a new one opened". Is this globally? Or is that per AccessLog, CustomLog or ErrorLog directive? Is there a significant difference between chronolog and rotatelogs?

Read the article
Possible DNS issue after a reinstall of Windows Server 2000 (get off my lawn)

- by cop1152

I just replaced a drive on a Win2000 Server that replicates AD and issues out DHCP at one of our offices. I successfully joined it to the domain, setup range of IP's, etc, but am still having issues. I cannot RDC to it with name or IP. I can ping it, browse to it with Windows Explorer, and remote to it with some other software, but not RDC. The other issue is this: Users are unable to authenticate on it. They receive the message 'username or password incorrect' (or something like that). Changes made on the main domain controller seem to take forever to trickle down. The most significant entry in the DNS Server Log is Event ID 7062: The DNS Server Encountered a Packet Addressed to Itself. At least, I think its significant. The Directory Services Log shows numerous Event IDs 1265: The attempt to establish a replication link with parameters failed with the following status: The DSA operation is unable to proceed because of a DNS lookup failure. Does this make any sense to anyone? I feel like its something very simple that I am overlooking. Thanks in advance.

Read the article
What is the oldest hardware still in production use? How is it kept running?

- by sleske

In the spirit of the question What is your oldest hardware that still works?, I'd like to ask: What is the oldest hardware you know that is still in production use? And what challenges did you (or someone else) face in keeping it running (scarce documentation, no support, no spare parts available...)? Most organizations will retire / upgrade software and hardware after 5-10 years, but sometimes old software is kept running on old boxes, because it "just works". I once worked at a client site that was running a critical piece of (in-house developed) business software on a single server running HP-UX. The server was old (ca. 12-13 years), but fortunately still running without problems; however, getting spares would have been very difficult, and since software installation was undocumented, any significant system changes or even new hardware might have caused significant downtime and data loss. We eventually managed to replace it, but this is not always possible. I also read that many organizations still run decade-old mainframe hardware, particularly for highly customized systems controlling industrial machines or power plants. Which old hardware have you encountered? How did you manage these challenges? Related question: http://serverfault.com/questions/82467/should-old-servers-be-retired

Read the article
How can a code editor effectively hint at code nesting level - without using indentation?

- by pgfearo

I'm writing an XML text editor that provides 2 view options for the same XML text, one indented (virtually), the other left-justified. The motivation for the left-justified view is to help users 'see' the whitespace characters they're using for indentation of plain-text or XPath code without interference from indentation that is an automated side-effect of the XML context. I want to provide visual clues (in the non-editable part of the editor) for the left-justified mode that will help the user, but without getting too elaborate. I tried just using connecting lines, but that seemed too busy. The best I've come up with so far is shown in a mocked up screenshot of the editor below, but I'm seeking better/simpler alternatives (that don't require too much code). [Edit] Taking the heatmap idea (from: @jimp) I get something like this: or even these alternates:

Read the article
The Case of the Extra Page: Rendering Reporting Services as PDF

- by smisner

I had to troubleshoot a problem with a mysterious extra page appearing in a PDF this week. My first thought was that it was likely to caused by one of the most common problems that people encounter when developing reports that eventually get rendered as PDF is getting blank pages inserted into the PDF document. The cause of the blank pages is usually related to sizing. You can learn more at Understanding Pagination in Reporting Services in Books Online. When designing a report, you have to be really careful with the layout of items in the body. As you move items around, the body will expand to accommodate the space you're using and you might eventually tighten everything back up again, but the body doesn't automatically collapse. One of my favorite things to do in Reporting Services 2005 - which I dubbed the "vacu-pack" method - was to just erase the size property of the Body and let it auto-calculate the new size, squeezing out all the extra space. Alas, that method no longer works beginning with Reporting Services 2008. Even when you make sure the body size is as small as possible (with no unnecessary extra space along the top, bottom, left, or right side of the body), it's important to calculate the body size plus header plus footer plus the margins and ensure that the calculated height and width do not exceed the report's height and width (shown as the page in the illustration above). This won't matter if users always render reports online, but they'll get extra pages in a PDF document if the report's height and width are smaller than the calculate space. Beginning the Investigation In the situation that I was troubleshooting, I checked the properties: Item Property Value Body Height 6.25in Width 10.5in Page Header Height 1in Page Footer Height 0.25in Report Left Margin 0.1in Right Margin 0.1in Top Margin 0.05in Bottom Margin 0.05in Page Size - Height 8.5in Page Size - Width 11in So I calculated the total width using Body Width + Left Margin + Right Margin and came up with a value of 10.7 inches. And then I calculated the total height using Body Height + Page Header Height + Page Footer Height + Top Margin + Bottom Margin and got 7.6 inches. Well, page sizing couldn't be the reason for the extra page in my report because 10.7 inches is smaller than the report's width of 11 inches and 7.6 inches is smaller than the report's height of 8.5 inches. I had to look elsewhere to find the culprit. Conducting the Third Degree My next thought was to focus on the rendering size of the items in the report. I've adapted my problem to use the Adventure Works database. At the top of the report are two charts, and then below each chart is a rectangle that contains a table. In the real-life scenario, there were some graphics present as a background for the tables which fit within the rectangles that were about 3 inches high so the visual space of the rectangles matched the visual space of the charts - also about 3 inches high. But there was also a huge amount of white space at the bottom of the page, and as I mentioned at the beginning of this post, a second page which was blank except for the footer that appeared at the bottom. Placing a textbox beneath the rectangles to see if they would appear on the first page resulted the textbox's appearance on the second page. For some reason, the rectangles wanted a buffer zone beneath them. What's going on? Taking the Suspect into Custody My next step was to see what was really going on with the rectangle. The graphic appeared to be correctly sized, but the behavior in the report indicated the rectangle was growing. So I added a border to the rectangle to see what it was doing. When I added borders, I could see that the size of each rectangle was growing to accommodate the table it contains. The rectangle on the right is slightly larger than the one on the left because the table on the right contains an extra row. The rectangle is trying to preserve the whitespace that appears in the layout, as shown below. Closing the Case Now that I knew what the problem was, what could I do about it? Because of the graphic in the rectangle (not shown), I couldn't eliminate the use of the rectangles and just show the tables. But fortunately, there is a report property that comes to the rescue: ConsumeContainerWhitespace (accessible only in the Properties window). I set the value of this property to True. Problem solved. Now the rectangles remain fixed at the configured size and don't grow vertically to preserve the whitespace. Case closed.

Read the article
Wise settings for Git

- by Marko Apfel

These settings reflecting my Git-environment. It a result of reading and trying several ideas of input from others. Must-Haves Aliases [alias] ci = commit st = status co = checkout oneline = log --pretty=oneline br = branch la = log --pretty=\"format:%ad %h (%an): %s\" --date=short df = diff dc = diff --cached lg = log -p lol = log --graph --decorate --pretty=oneline --abbrev-commit lola = log --graph --decorate --pretty=oneline --abbrev-commit --all ls = ls-files ign = ls-files -o -i --exclude-standard Colors [color] ui = auto [color "branch"] current = yellow reverse local = yellow remote = green [color "diff"] meta = yellow bold frag = magenta bold old = red bold new = green bold whitespace = red reverse [color "status"] added = green changed = red untracked = cyan Core [core] autocrlf = true excludesfile = c:/Users/<user>/.gitignore editor = 'C:/Program Files (x86)/Notepad++/notepad++.exe' -multiInst -notabbar -nosession –noPlugin Nice to have Merge and Diff [merge] tool = kdiff3 [mergetool "kdiff3"] path = c:/Program Files (x86)/KDiff3/kdiff3.exe [mergetool "p4merge"] path = c:/Program Files (x86)/Perforce Merge/p4merge.exe cmd = p4merge \"$BASE\" \"$LOCAL\" \"$REMOTE\" \"$MERGED\" keepTemporaries = false trustExitCode = false keepBackup = false [diff] guitool = kdiff3 [difftool "kdiff3"] path = c:/Program Files (x86)/KDiff3/kdiff3.exe [difftool "p4merge"] path = C:/Users/<user>/My Applications/Perforce Merge/p4merge.exe cmd = \"p4merge.exe $LOCAL $REMOTE\" .

Read the article
Make Apache encode or replace quotes instead of escaping them?

- by mplungjan

In the dcoumentation I read Format Notes For security reasons, starting with version 2.0.46, non-printable and other special characters in %r, %i and %o are escaped using \xhh sequences, where hh stands for the hexadecimal representation of the raw byte. Exceptions from this rule are " and \, which are escaped by prepending a backslash, and all whitespace characters, which are written in their C-style notation (\n, \t, etc). In versions prior to 2.0.46, no escaping was performed on these strings so you had to be quite careful when dealing with raw log files. This is a problem for Analog which is still the handiest analyser I use. I get .... "GET /somerequest?q=\"quoted string\"&someparm=bla" in the logfile and it is of course flagged as corrupt since Analog expects .... "GET /somerequest?q=%22quoted string%22&someparm=bla" or similar. I realise I can pre-process using something like perl -p -i.bak -e 's/\\"/%22/g' logfile But I'd rather not have to add this step to these files which are 50-90MB zipped per day Thanks for any pointers

Read the article
ECMAScript : ajouts et modifications de la 5e édition qui introduisent des incompatibilités avec le 3e

(une petite traduction rapide des changements impactant entre la v5 et la v3. les numéros correspondent aux chapitres de la norme) Ecma-362 alias EcmaScript5 Annex E Ajouts et modifications dans la 5e édition qui introduisent des incompatibilités avec le 3e édition 7.1: si des caractères de contrôle Unicode sont présent dans une expression String ou une expression d'ExpressionRegulière ils seront inclus dans la l'expression. alors que dans l'édition 3 ils étaient ignorés. 7.2: le caractère Unicode <BOM> (Byte Order Mark) est maintenant traité comme un whitespace alors qu'il provoquait une Syntax Error dans l'édition ...

Read the article
Why don't we store the syntax tree instead of the source code?

- by Calmarius

We have a lot of programming languages. Every language is parsed and syntax checked before translated into code so an abstract syntax tree is built. We have this abstract syntax tree, why don't we store this syntax tree instead of the source code (or next to the source code)? By using an AST instead of the source code. Every programmer in a team can serialize this tree to any language, they want (with the appropriate context free grammar) and parse back to AST when they finished. So this would eliminate the debate about the coding style questions (where to put the { and }, where to put whitespace, indentation, etc.) What are the pros and cons of this approach?

Read the article
Linq-to-XML query to select specific sub-element based on additional criteria

- by BrianLy

My current LINQ query and example XML are below. What I'd like to do is select the primary email address from the email-addresses element into the User.Email property. The type element under the email-address element is set to primary when this is true. There may be more than one element under the email-addresses but only one will be marked primary. What is the simplest approach to take here? Current Linq Query (User.Email is currently empty): var users = from response in xdoc.Descendants("response") where response.Element("id") != null select new User { Id = (string)response.Element("id"), Name = (string)response.Element("full-name"), Email = (string)response.Element("email-addresses"), JobTitle = (string)response.Element("job-title"), NetworkId = (string)response.Element("network-id"), Type = (string)response.Element("type") }; Example XML: <?xml version="1.0" encoding="UTF-8"?> <response> <response> <contact> <phone-numbers/> <im> <provider></provider> <username></username> </im> <email-addresses> <email-address> <type>primary</type> <address>[email protected]</address> </email-address> </email-addresses> </contact> <job-title>Account Manager</job-title> <type>user</type> <expertise nil="true"></expertise> <summary nil="true"></summary> <kids-names nil="true"></kids-names> <location nil="true"></location> <guid nil="true"></guid> <timezone>Eastern Time (US & Canada)</timezone> <network-name>Domain</network-name> <full-name>Alice</full-name> <network-id>79629</network-id> <stats> <followers>2</followers> <updates>4</updates> <following>3</following> </stats> <mugshot-url> https://assets3.yammer.com/images/no_photo_small.gif</mugshot-url> <previous-companies/> <birth-date></birth-date> <name>alice</name> <web-url>https://www.yammer.com/domain.com/users/alice</web-url> <interests nil="true"></interests> <state>active</state> <external-urls/> <url>https://www.yammer.com/api/v1/users/1089943</url> <network-domains> <network-domain>domain.com</network-domain> </network-domains> <id>1089943</id> <schools/> <hire-date nil="true"></hire-date> <significant-other nil="true"></significant-other> </response> <response> <contact> <phone-numbers/> <im> <provider></provider> <username></username> </im> <email-addresses> <email-address> <type>primary</type> <address>[email protected]</address> </email-address> </email-addresses> </contact> <job-title>Office Manager</job-title> <type>user</type> <expertise nil="true"></expertise> <summary nil="true"></summary> <kids-names nil="true"></kids-names> <location nil="true"></location> <guid nil="true"></guid> <timezone>Eastern Time (US & Canada)</timezone> <network-name>Domain</network-name> <full-name>Bill</full-name> <network-id>79629</network-id> <stats> <followers>3</followers> <updates>1</updates> <following>1</following> </stats> <mugshot-url> https://assets3.yammer.com/images/no_photo_small.gif</mugshot-url> <previous-companies/> <birth-date></birth-date> <name>bill</name> <web-url>https://www.yammer.com/domain.com/users/bill</web-url> <interests nil="true"></interests> <state>active</state> <external-urls/> <url>https://www.yammer.com/api/v1/users/1089920</url> <network-domains> <network-domain>domain.com</network-domain> </network-domains> <id>1089920</id> <schools/> <hire-date nil="true"></hire-date> <significant-other nil="true"></significant-other> </response> </response>

Read the article
Standards Corner: OAuth WG Client Registration Problem

- by Tanu Sood

Phil Hunt is an active member of multiple industry standards groups and committees (see brief bio at the end of the post) and has spearheaded discussions, creation and ratifications of Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-family:"Calibri","sans-serif"; mso-ascii- mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi- mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} industry standards including the Kantara Identity Governance Framework, among others. Being an active voice in the industry standards development world, we have invited him to share his discussions, thoughts, news & updates, and discuss use cases, implementation success stories (and even failures) around industry standards on this monthly column. Author: Phil Hunt This afternoon, the OAuth Working Group will meet at IETF88 in Vancouver to discuss some important topics important to the maturation of OAuth. One of them is the OAuth client registration problem.OAuth (RFC6749) was initially developed with a simple deployment model where there is only monopoly or singleton cloud instance of a web API (e.g. there is one Facebook, one Google, on LinkedIn, and so on). When the API publisher and API deployer are the same monolithic entity, it easy for developers to contact the provider and register their app to obtain a client_id and credential.But what happens when the API is for an open source project where there may be 1000s of deployed copies of the API (e.g. such as wordpress). In these cases, the authors of the API are not the people running the API. In these scenarios, how does the developer obtain a client_id? An example of an "open deployed" API is OpenID Connect. Connect defines an OAuth protected resource API that can provide personal information about an authenticated user -- in effect creating a potentially common API for potential identity providers like Facebook, Google, Microsoft, Salesforce, or Oracle. In Oracle's case, Fusion applications will soon have RESTful APIs that are deployed in many different ways in many different environments. How will developers write apps that can work against an openly deployed API with whom the developer can have no prior relationship?At present, the OAuth Working Group has two proposals two consider: Dynamic RegistrationDynamic Registration was originally developed for OpenID Connect and UMA. It defines a RESTful API in which a prospective client application with no client_id creates a new client registration record with a service provider and is issued a client_id and credential along with a registration token that can be used to update registration over time.As proof of success, the OIDC community has done substantial implementation of this spec and feels committed to its use. Why not approve?Well, the answer is that some of us had some concerns, namely: Recognizing instances of software - dynamic registration treats all clients as unique. It has no defined way to recognize that multiple copies of the same client are being registered other then assuming if the registration parameters are similar it might be the same client. Versioning and Policy Approval of open APIs and clients - many service providers have to worry about change management. They expect to have approval cycles that approve versions of server and client software for use in their environment. In some cases approval might be wide open, but in many cases, approval might be down to the specific class of software and version. Registration updates - when does a client actually need to update its registration? Shouldn't it be never? Is there some characteristic of deployed code that would cause it to change? Options lead to complexity - because each client is treated as unique, it becomes unclear how the clients and servers will agree on what credentials forms are acceptable and what OAuth features are allowed and disallowed. Yet the reality is, developers will write their application to work in a limited number of ways. They can't implement all the permutations and combinations that potential service providers might choose. Stateful registration - if the primary motivation for registration is to obtain a client_id and credential, why can't this be done in a stateless fashion using assertions? Denial of service - With so much stateful registration and the need for multiple tokens to be issued, will this not lead to a denial of service attack / risk of resource depletion? At the very least, because of the information gathered, it would difficult for service providers to clean up "failed" registrations and determine active from inactive or false clients. There has yet to be much wide-scale "production" use of dynamic registration other than in small closed communities. Client Association A second proposal, Client Association, has been put forward by Tony Nadalin of Microsoft and myself. We took at look at existing use patterns to come up with a new proposal. At the Berlin meeting, we considered how WS-STS systems work. More recently, I took a review of how mobile messaging clients work. I looked at how Apple, Google, and Microsoft each handle registration with APNS, GCM, and WNS, and a similar pattern emerges. This pattern is to use an existing credential (mutual TLS auth), or client bearer assertion and swap for a device specific bearer assertion.In the client association proposal, the developer's registration with the API publisher is handled by having the developer register with an API publisher (as opposed to the party deploying the API) and obtaining a software "statement". Or, if there is no "publisher" that can sign a statement, the developer may include their own self-asserted software statement.A software statement is a special type of assertion that serves to lock application registration profile information in a signed assertion. The statement is included with the client application and can then be used by the client to swap for an instance specific client assertion as defined by section 4.2 of the OAuth Assertion draft and profiled in the Client Association draft. The software statement provides a way for service provider to recognize and configure policy to approve classes of software clients, and simplifies the actual registration to a simple assertion swap. Because the registration is an assertion swap, registration is no longer "stateful" - meaning the service provider does not need to store any information to support the client (unless it wants to). Has this been implemented yet? Not directly. We've only delivered draft 00 as an alternate way of solving the problem using well-known patterns whose security characteristics and scale characteristics are well understood. Dynamic Take II At roughly the same time that Client Association and Software Statement were published, the authors of Dynamic Registration published a "split" version of the Dynamic Registration (draft-richer-oauth-dyn-reg-core and draft-richer-oauth-dyn-reg-management). While some of the concerns above are addressed, some differences remain. Registration is now a simple POST request. However it defines a new method for issuing client tokens where as Client Association uses RFC6749's existing extension point. The concern here is whether future client access token formats would be addressed properly. Finally, Dyn-reg-core does not yet support software statements. Conclusion The WG has some interesting discussion to bring this back to a single set of specifications. Dynamic Registration has significant implementation, but Client Association could be a much improved way to simplify implementation of the overall OpenID Connect specification and improve adoption. In fairness, the existing editors have already come a long way. Yet there are those with significant investment in the current draft. There are many that have expressed they don't care. They just want a standard. There is lots of pressure on the working group to reach consensus quickly.And that folks is how the sausage is made.Note: John Bradley and Justin Richer recently published draft-bradley-stateless-oauth-client-00 which on first look are getting closer. Some of the details seem less well defined, but the same could be said of client-assoc and software-statement. I hope we can merge these specs this week. Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-family:"Calibri","sans-serif"; mso-ascii- mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi- mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} About the Writer: Phil Hunt joined Oracle as part of the November 2005 acquisition of OctetString Inc. where he headed software development for what is now Oracle Virtual Directory. Since joining Oracle, Phil works as CMTS in the Identity Standards group at Oracle where he developed the Kantara Identity Governance Framework and provided significant input to JSR 351. Phil participates in several standards development organizations such as IETF and OASIS working on federation, authorization (OAuth), and provisioning (SCIM) standards. Phil blogs at www.independentid.com and a Twitter handle of @independentid.

Read the article
Solaris X86 64-bit Assembly Programming

- by danx

Solaris X86 64-bit Assembly Programming This is a simple example on writing, compiling, and debugging Solaris 64-bit x86 assembly language with a C program. This is also referred to as "AMD64" assembly. The term "AMD64" is used in an inclusive sense to refer to all X86 64-bit processors, whether AMD Opteron family or Intel 64 processor family. Both run Solaris x86. I'm keeping this example simple mainly to illustrate how everything comes together—compiler, assembler, linker, and debugger when using assembly language. The example I'm using here is a C program that calls an assembly language program passing a C string. The assembly language program takes the C string and calls printf() with it to print the string. AMD64 Register Usage But first let's review the use of AMD64 registers. AMD64 has several 64-bit registers, some special purpose (such as the stack pointer) and others general purpose. By convention, Solaris follows the AMD64 ABI in register usage, which is the same used by Linux, but different from Microsoft Windows in usage (such as which registers are used to pass parameters). This blog will only discuss conventions for Linux and Solaris. The following chart shows how AMD64 registers are used. The first six parameters to a function are passed through registers. If there's more than six parameters, parameter 7 and above are pushed on the stack before calling the function. The stack is also used to save temporary "stack" variables for use by a function. 64-bit Register Usage %rip Instruction Pointer points to the current instruction %rsp Stack Pointer %rbp Frame Pointer (saved stack pointer pointing to parameters on stack) %rdi Function Parameter 1 %rsi Function Parameter 2 %rdx Function Parameter 3 %rcx Function Parameter 4 %r8 Function Parameter 5 %r9 Function Parameter 6 %rax Function return value %r10, %r11 Temporary registers (need not be saved before used) %rbx, %r12, %r13, %r14, %r15 Temporary registers, but must be saved before use and restored before returning from the current function (usually with the push and pop instructions). 32-, 16-, and 8-bit registers To access the lower 32-, 16-, or 8-bits of a 64-bit register use the following: 64-bit register Least significant 32-bits Least significant 16-bits Least significant 8-bits %rax%eax%ax%al %rbx%ebx%bx%bl %rcx%ecx%cx%cl %rdx%edx%dx%dl %rsi%esi%si%sil %rdi%edi%di%axl %rbp%ebp%bp%bp %rsp%esp%sp%spl %r9%r9d%r9w%r9b %r10%r10d%r10w%r10b %r11%r11d%r11w%r11b %r12%r12d%r12w%r12b %r13%r13d%r13w%r13b %r14%r14d%r14w%r14b %r15%r15d%r15w%r15b %r16%r16d%r16w%r16b There's other registers present, such as the 64-bit %mm registers, 128-bit %xmm registers, 256-bit %ymm registers, and 512-bit %zmm registers. Except for %mm registers, these registers may not present on older AMD64 processors. Assembly Source The following is the source for a C program, helloas1.c, that calls an assembly function, hello_asm(). $ cat helloas1.c extern void hello_asm(char *s); int main(void) { hello_asm("Hello, World!"); } The assembly function called above, hello_asm(), is defined below. $ cat helloas2.s /* * helloas2.s * To build: * cc -m64 -o helloas2-cpp.s -D_ASM -E helloas2.s * cc -m64 -c -o helloas2.o helloas2-cpp.s */ #if defined(lint) || defined(__lint) /* ARGSUSED */ void hello_asm(char *s) { } #else /* lint */ #include <sys/asm_linkage.h> .extern printf ENTRY_NP(hello_asm) // Setup printf parameters on stack mov %rdi, %rsi // P2 (%rsi) is string variable lea .printf_string, %rdi // P1 (%rdi) is printf format string call printf ret SET_SIZE(hello_asm) // Read-only data .text .align 16 .type .printf_string, @object .printf_string: .ascii "The string is: %s.\n\0" #endif /* lint || __lint */ In the assembly source above, the C skeleton code under "#if defined(lint)" is optionally used for lint to check the interfaces with your C program--very useful to catch nasty interface bugs. The "asm_linkage.h" file includes some handy macros useful for assembly, such as ENTRY_NP(), used to define a program entry point, and SET_SIZE(), used to set the function size in the symbol table. The function hello_asm calls C function printf() by passing two parameters, Parameter 1 (P1) is a printf format string, and P2 is a string variable. The function begins by moving %rdi, which contains Parameter 1 (P1) passed hello_asm, to printf()'s P2, %rsi. Then it sets printf's P1, the format string, by loading the address the address of the format string in %rdi, P1. Finally it calls printf. After returning from printf, the hello_asm function returns itself. Larger, more complex assembly functions usually do more setup than the example above. If a function is returning a value, it would set %rax to the return value. Also, it's typical for a function to save the %rbp and %rsp registers of the calling function and to restore these registers before returning. %rsp contains the stack pointer and %rbp contains the frame pointer. Here is the typical function setup and return sequence for a function: ENTRY_NP(sample_assembly_function) push %rbp // save frame pointer on stack mov %rsp, %rbp // save stack pointer in frame pointer xor %rax, %r4ax // set function return value to 0. mov %rbp, %rsp // restore stack pointer pop %rbp // restore frame pointer ret // return to calling function SET_SIZE(sample_assembly_function) Compiling and Running Assembly Use the Solaris cc command to compile both C and assembly source, and to pre-process assembly source. You can also use GNU gcc instead of cc to compile, if you prefer. The "-m64" option tells the compiler to compile in 64-bit address mode (instead of 32-bit). $ cc -m64 -o helloas2-cpp.s -D_ASM -E helloas2.s $ cc -m64 -c -o helloas2.o helloas2-cpp.s $ cc -m64 -c helloas1.c $ cc -m64 -o hello-asm helloas1.o helloas2.o $ file hello-asm helloas1.o helloas2.o hello-asm: ELF 64-bit LSB executable AMD64 Version 1 [SSE FXSR FPU], dynamically linked, not stripped helloas1.o: ELF 64-bit LSB relocatable AMD64 Version 1 helloas2.o: ELF 64-bit LSB relocatable AMD64 Version 1 $ hello-asm The string is: Hello, World!. Debugging Assembly with MDB MDB is the Solaris system debugger. It can also be used to debug user programs, including assembly and C. The following example runs the above program, hello-asm, under control of the debugger. In the example below I load the program, set a breakpoint at the assembly function hello_asm, display the registers and the first parameter, step through the assembly function, and continue execution. $ mdb hello-asm # Start the debugger > hello_asm:b # Set a breakpoint > ::run # Run the program under the debugger mdb: stop at hello_asm mdb: target stopped at: hello_asm: movq %rdi,%rsi > $C # display function stack ffff80ffbffff6e0 hello_asm() ffff80ffbffff6f0 0x400adc() > $r # display registers %rax = 0x0000000000000000 %r8 = 0x0000000000000000 %rbx = 0xffff80ffbf7f8e70 %r9 = 0x0000000000000000 %rcx = 0x0000000000000000 %r10 = 0x0000000000000000 %rdx = 0xffff80ffbffff718 %r11 = 0xffff80ffbf537db8 %rsi = 0xffff80ffbffff708 %r12 = 0x0000000000000000 %rdi = 0x0000000000400cf8 %r13 = 0x0000000000000000 %r14 = 0x0000000000000000 %r15 = 0x0000000000000000 %cs = 0x0053 %fs = 0x0000 %gs = 0x0000 %ds = 0x0000 %es = 0x0000 %ss = 0x004b %rip = 0x0000000000400c70 hello_asm %rbp = 0xffff80ffbffff6e0 %rsp = 0xffff80ffbffff6c8 %rflags = 0x00000282 id=0 vip=0 vif=0 ac=0 vm=0 rf=0 nt=0 iopl=0x0 status=<of,df,IF,tf,SF,zf,af,pf,cf> %gsbase = 0x0000000000000000 %fsbase = 0xffff80ffbf782a40 %trapno = 0x3 %err = 0x0 > ::dis # disassemble the current instructions hello_asm: movq %rdi,%rsi hello_asm+3: leaq 0x400c90,%rdi hello_asm+0xb: call -0x220 <PLT:printf> hello_asm+0x10: ret 0x400c81: nop 0x400c85: nop 0x400c88: nop 0x400c8c: nop 0x400c90: pushq %rsp 0x400c91: pushq $0x74732065 0x400c96: jb +0x69 <0x400d01> > 0x0000000000400cf8/S # %rdi contains Parameter 1 0x400cf8: Hello, World! > [ # Step and execute 1 instruction mdb: target stopped at: hello_asm+3: leaq 0x400c90,%rdi > [ mdb: target stopped at: hello_asm+0xb: call -0x220 <PLT:printf> > [ The string is: Hello, World!. mdb: target stopped at: hello_asm+0x10: ret > [ mdb: target stopped at: main+0x19: movl $0x0,-0x4(%rbp) > :c # continue program execution mdb: target has terminated > $q # quit the MDB debugger $ In the example above, at the start of function hello_asm(), I display the stack contents with "$C", display the registers contents with "$r", then disassemble the current function with "::dis". The first function parameter, which is a C string, is passed by reference with the string address in %rdi (see the register usage chart above). The address is 0x400cf8, so I print the value of the string with the "/S" MDB command: "0x0000000000400cf8/S". I can also print the contents at an address in several other formats. Here's a few popular formats. For more, see the mdb(1) man page for details. address/S C string address/C ASCII character (1 byte) address/E unsigned decimal (8 bytes) address/U unsigned decimal (4 bytes) address/D signed decimal (4 bytes) address/J hexadecimal (8 bytes) address/X hexadecimal (4 bytes) address/B hexadecimal (1 bytes) address/K pointer in hexadecimal (4 or 8 bytes) address/I disassembled instruction Finally, I step through each machine instruction with the "[" command, which steps over functions. If I wanted to enter a function, I would use the "]" command. Then I continue program execution with ":c", which continues until the program terminates. MDB Basic Cheat Sheet Here's a brief cheat sheet of some of the more common MDB commands useful for assembly debugging. There's an entire set of macros and more powerful commands, especially some for debugging the Solaris kernel, but that's beyond the scope of this example. $C Display function stack with pointers $c Display function stack $e Display external function names $v Display non-zero variables and registers $r Display registers ::fpregs Display floating point (or "media" registers). Includes %st, %xmm, and %ymm registers. ::status Display program status ::run Run the program (followed by optional command line parameters) $q Quit the debugger address:b Set a breakpoint address:d Delete a breakpoint $b Display breakpoints :c Continue program execution after a breakpoint [ Step 1 instruction, but step over function calls ] Step 1 instruction address::dis Disassemble instructions at an address ::events Display events Further Information "Assembly Language Techniques for Oracle Solaris on x86 Platforms" by Paul Lowik (2004). Good tutorial on Solaris x86 optimization with assembly. The Solaris Operating System on x86 Platforms An excellent, detailed tutorial on X86 architecture, with Solaris specifics. By an ex-Sun employee, Frank Hofmann (2005). "AMD64 ABI Features", Solaris 64-bit Developer's Guide contains rules on data types and register usage for Intel 64/AMD64-class processors. (available at docs.oracle.com) Solaris X86 Assembly Language Reference Manual (available at docs.oracle.com) SPARC Assembly Language Reference Manual (available at docs.oracle.com) System V Application Binary Interface (2003) defines the AMD64 ABI for UNIX-class operating systems, including Solaris, Linux, and BSD. Google for it—the original website is gone. cc(1), gcc(1), and mdb(1) man pages.

Read the article
Investigation: Can different combinations of components effect Dataflow performance?

- by jamiet

Introduction The Dataflow task is one of the core components (if not the core component) of SQL Server Integration Services (SSIS) and often the most misunderstood. This is not surprising, its an incredibly complicated beast and we’re abstracted away from that complexity via some boxes that go yellow red or green and that have some lines drawn between them. Example dataflow In this blog post I intend to look under that facade and get into some of the nuts and bolts of the Dataflow Task by investigating how the decisions we make when building our packages can affect performance. I will do this by comparing the performance of three dataflows that all have the same input, all produce the same output, but which all operate slightly differently by way of having different transformation components. I also want to use this blog post to challenge a common held opinion that I see perpetuated over and over again on the SSIS forum. That is, that people assume adding components to a dataflow will be detrimental to overall performance. Its not surprising that people think this –it is intuitive to think that more components means more work- however this is not a view that I share. I have always been of the opinion that there are many factors affecting dataflow duration and the number of components is actually one of the less important ones; having said that I have never proven that assertion and that is one reason for this investigation. I have actually seen evidence that some people think dataflow duration is simply a function of number of rows and number of components. I’ll happily call that one out as a myth even without any investigation! The Setup I have a 2GB datafile which is a list of 4731904 (~4.7million) customer records with various attributes against them and it contains 2 columns that I am going to use for categorisation: [YearlyIncome] [BirthDate] The data file is a SSIS raw format file which I chose to use because it is the quickest way of getting data into a dataflow and given that I am testing the transformations, not the source or destination adapters, I want to minimise external influences as much as possible. In the test I will split the customers according to month of birth (12 of those) and whether or not their yearly income is above or below 50000 (2 of those); in other words I will be splitting them into 24 discrete categories and in order to do it I shall be using different combinations of SSIS’ Conditional Split and Derived Column transformation components. The 24 datapaths that occur will each input to a rowcount component, again because this is the least resource intensive means of terminating a datapath. The test is being carried out on a Dell XPS Studio laptop with a quad core (8 logical Procs) Intel Core i7 at 1.73GHz and Samsung SSD hard drive. Its running SQL Server 2008 R2 on Windows 7. The Variables Here are the three combinations of components that I am going to test: One Conditional Split - A single Conditional Split component CSPL Split by Month of Birth and income category that will use expressions on [YearlyIncome] & [BirthDate] to send each row to one of 24 outputs. This next screenshot displays the expression logic in use: Derived Column & Conditional Split - A Derived Column component DER Income Category that adds a new column [IncomeCategory] which will contain one of two possible text values {“LessThan50000”,”GreaterThan50000”} and uses [YearlyIncome] to determine which value each row should get. A Conditional Split component CSPL Split by Month of Birth and Income Category then uses that new column in conjunction with [BirthDate] to determine which of the same 24 outputs to send each row to. Put more simply, I am separating the Conditional Split of #1 into a Derived Column and a Conditional Split. The next screenshots display the expression logic in use: DER Income Category CSPL Split by Month of Birth and Income Category Three Conditional Splits - A Conditional Split component that produces two outputs based on [YearlyIncome], one for each Income Category. Each of those outputs will go to a further Conditional Split that splits the input into 12 outputs, one for each month of birth (identical logic in each). In this case then I am separating the single Conditional Split of #1 into three Conditional Split components. The next screenshots display the expression logic in use: CSPL Split by Income Category CSPL Split by Month of Birth 1& 2 Each of these combinations will provide an input to one of the 24 rowcount components, just the same as before. For illustration here is a screenshot of the dataflow containing three Conditional Split components: As you can these dataflows have a fair bit of work to do and remember that they’re doing that work for 4.7million rows. I will execute each dataflow 10 times and use the average for comparison. I foresee three possible outcomes: The dataflow containing just one Conditional Split (i.e. #1) will be quicker There is no significant difference between any of them One of the two dataflows containing multiple transformation components will be quicker Regardless of which of those outcomes come to pass we will have learnt something and that makes this an interesting test to carry out. Note that I will be executing the dataflows using dtexec.exe rather than hitting F5 within BIDS. The Results and Analysis The table below shows all of the executions, 10 for each dataflow. It also shows the average for each along with a standard deviation. All durations are in seconds. I’m pasting a screenshot because I frankly can’t be bothered with the faffing about needed to make a presentable HTML table. It is plain to see from the average that the dataflow containing three conditional splits is significantly faster, the other two taking 43% and 52% longer respectively. This seems strange though, right? Why does the dataflow containing the most components outperform the other two by such a big margin? The answer is actually quite logical when you put some thought into it and I’ll explain that below. Before progressing, a side note. The standard deviation for the “Three Conditional Splits” dataflow is orders of magnitude smaller – indicating that performance for this dataflow can be predicted with much greater confidence too. The Explanation I refer you to the screenshot above that shows how CSPL Split by Month of Birth and salary category in the first dataflow is setup. Observe that there is a case for each combination of Month Of Date and Income Category – 24 in total. These expressions get evaluated in the order that they appear and hence if we assume that Month of Date and Income Category are uniformly distributed in the dataset we can deduce that the expected number of expression evaluations for each row is 12.5 i.e. 1 (the minimum) + 24 (the maximum) divided by 2 = 12.5. Now take a look at the screenshots for the second dataflow. We are doing one expression evaluation in DER Income Category and we have the same 24 cases in CSPL Split by Month of Birth and Income Category as we had before, only the expression differs slightly. In this case then we have 1 + 12.5 = 13.5 expected evaluations for each row – that would account for the slightly longer average execution time for this dataflow. Now onto the third dataflow, the quick one. CSPL Split by Income Category does a maximum of 2 expression evaluations thus the expected number of evaluations per row is 1.5. CSPL Split by Month of Birth 1 & CSPL Split by Month of Birth 2 both have less work to do than the previous Conditional Split components because they only have 12 cases to test for thus the expected number of expression evaluations is 6.5 There are two of them so total expected number of expression evaluations for this dataflow is 6.5 + 6.5 + 1.5 = 14.5. 14.5 is still more than 12.5 & 13.5 though so why is the third dataflow so much quicker? Simple, the conditional expressions in the first two dataflows have two boolean predicates to evaluate – one for Income Category and one for Month of Birth; the expressions in the Conditional Split in the third dataflow however only have one predicate thus they are doing a lot less work. To sum up, the difference in execution times can be attributed to the difference between: MONTH(BirthDate) == 1 && YearlyIncome <= 50000 and MONTH(BirthDate) == 1 In the first two dataflows YearlyIncome <= 50000 gets evaluated an average of 12.5 times for every row whereas in the third dataflow it is evaluated once and once only. Multiply those 11.5 extra operations by 4.7million rows and you get a significant amount of extra CPU cycles – that’s where our duration difference comes from. The Wrap-up The obvious point here is that adding new components to a dataflow isn’t necessarily going to make it go any slower, moreover you may be able to achieve significant improvements by splitting logic over multiple components rather than one. Performance tuning is all about reducing the amount of work that needs to be done and that doesn’t necessarily mean use less components, indeed sometimes you may be able to reduce workload in ways that aren’t immediately obvious as I think I have proven here. Of course there are many variables in play here and your mileage will most definitely vary. I encourage you to download the package and see if you get similar results – let me know in the comments. The package contains all three dataflows plus a fourth dataflow that will create the 2GB raw file for you (you will also need the [AdventureWorksDW2008] sample database from which to source the data); simply disable all dataflows except the one you want to test before executing the package and remember, execute using dtexec, not within BIDS. If you want to explore dataflow performance tuning in more detail then here are some links you might want to check out: Inequality joins, Asynchronous transformations and Lookups Destination Adapter Comparison Don’t turn the dataflow into a cursor SSIS Dataflow – Designing for performance (webinar) Any comments? Let me know! @Jamiet

Read the article
Why Are Minimized Programs Often Slow to Open Again?

- by Jason Fitzpatrick

It seems particularly counterintuitive: you minimize an application because you plan on returning to it later and wish to skip shutting the application down and restarting it later, but sometimes maximizing it takes even longer than launching it fresh. What gives? Today’s Question & Answer session comes to us courtesy of SuperUser—a subdivision of Stack Exchange, a community-driven grouping of Q&A web sites. The Question SuperUser reader Bart wants to know why he’s not saving any time with application minimization: I’m working in Photoshop CS6 and multiple browsers a lot. I’m not using them all at once, so sometimes some applications are minimized to taskbar for hours or days. The problem is, when I try to maximize them from the taskbar – it sometimes takes longer than starting them! Especially Photoshop feels really weird for many seconds after finally showing up, it’s slow, unresponsive and even sometimes totally freezes for minute or two. It’s not a hardware problem as it’s been like that since always on all on my PCs. Would I also notice it after upgrading my HDD to SDD and adding RAM (my main PC holds 4 GB currently)? Could guys with powerful pcs / macs tell me – does it also happen to you? I guess OSes somehow “focus” on active software and move all the resources away from the ones that run, but are not used. Is it possible to somehow set RAM / CPU / HDD priorities or something, for let’s say, Photoshop, so it won’t slow down after long period of inactivity? So what is the deal? Why does he find himself waiting to maximize a minimized app? The Answer SuperUser contributor Allquixotic explains why: Summary The immediate problem is that the programs that you have minimized are being paged out to the “page file” on your hard disk. This symptom can be improved by installing a Solid State Disk (SSD), adding more RAM to your system, reducing the number of programs you have open, or upgrading to a newer system architecture (for instance, Ivy Bridge or Haswell). Out of these options, adding more RAM is generally the most effective solution. Explanation The default behavior of Windows is to give active applications priority over inactive applications for having a spot in RAM. When there’s significant memory pressure (meaning the system doesn’t have a lot of free RAM if it were to let every program have all the RAM it wants), it starts putting minimized programs into the page file, which means it writes out their contents from RAM to disk, and then makes that area of RAM free. That free RAM helps programs you’re actively using — say, your web browser — run faster, because if they need to claim a new segment of RAM (like when you open a new tab), they can do so. This “free” RAM is also used as page cache, which means that when active programs attempt to read data on your hard disk, that data might be cached in RAM, which prevents your hard disk from being accessed to get that data. By using the majority of your RAM for page cache, and swapping out unused programs to disk, Windows is trying to improve responsiveness of the program(s) you are actively using, by making RAM available to them, and caching the files they access in RAM instead of the hard disk. The downside of this behavior is that minimized programs can take a while to have their contents copied from the page file, on disk, back into RAM. The time increases the larger the program’s footprint in memory. This is why you experience that delay when maximizing Photoshop. RAM is many times faster than a hard disk (depending on the specific hardware, it can be up to several orders of magnitude). An SSD is considerably faster than a hard disk, but it is still slower than RAM by orders of magnitude. Having your page file on an SSD will help, but it will also wear out the SSD more quickly than usual if your page file is heavily utilized due to RAM pressure. Remedies Here is an explanation of the available remedies, and their general effectiveness: Installing more RAM: This is the recommended path. If your system does not support more RAM than you already have installed, you will need to upgrade more of your system: possibly your motherboard, CPU, chassis, power supply, etc. depending on how old it is. If it’s a laptop, chances are you’ll have to buy an entire new laptop that supports more installed RAM. When you install more RAM, you reduce memory pressure, which reduces use of the page file, which is a good thing all around. You also make available more RAM for page cache, which will make all programs that access the hard disk run faster. As of Q4 2013, my personal recommendation is that you have at least 8 GB of RAM for a desktop or laptop whose purpose is anything more complex than web browsing and email. That means photo editing, video editing/viewing, playing computer games, audio editing or recording, programming / development, etc. all should have at least 8 GB of RAM, if not more. Run fewer programs at a time: This will only work if the programs you are running do not use a lot of memory on their own. Unfortunately, Adobe Creative Suite products such as Photoshop CS6 are known for using an enormous amount of memory. This also limits your multitasking ability. It’s a temporary, free remedy, but it can be an inconvenience to close down your web browser or Word every time you start Photoshop, for instance. This also wouldn’t stop Photoshop from being swapped when minimizing it, so it really isn’t a very effective solution. It only helps in some specific situations. Install an SSD: If your page file is on an SSD, the SSD’s improved speed compared to a hard disk will result in generally improved performance when the page file has to be read from or written to. Be aware that SSDs are not designed to withstand a very frequent and constant random stream of writes; they can only be written over a limited number of times before they start to break down. Heavy use of a page file is not a particularly good workload for an SSD. You should install an SSD in combination with a large amount of RAM if you want maximum performance while preserving the longevity of the SSD. Use a newer system architecture: Depending on the age of your system, you may be using an out of date system architecture. The “system architecture” is generally defined as the “generation” (think generations like children, parents, grandparents, etc.) of the motherboard and CPU. Newer generations generally support faster I/O (input/output), better memory bandwidth, lower latency, and less contention over shared resources, instead providing dedicated links between components. For example, starting with the “Nehalem” generation (around 2009), the Front-Side Bus (FSB) was eliminated, which removed a common bottleneck, because almost all system components had to share the same FSB for transmitting data. This was replaced with a “point to point” architecture, meaning that each component gets its own dedicated “lane” to the CPU, which continues to be improved every few years with new generations. You will generally see a more significant improvement in overall system performance depending on the “gap” between your computer’s architecture and the latest one available. For example, a Pentium 4 architecture from 2004 is going to see a much more significant improvement upgrading to “Haswell” (the latest as of Q4 2013) than a “Sandy Bridge” architecture from ~2010. Links Related questions: How to reduce disk thrashing (paging)? Windows Swap (Page File): Enable or Disable? Also, just in case you’re considering it, you really shouldn’t disable the page file, as this will only make matters worse; see here. And, in case you needed extra convincing to leave the Windows Page File alone, see here and here. Have something to add to the explanation? Sound off in the the comments. Want to read more answers from other tech-savvy Stack Exchange users? Check out the full discussion thread here.

Read the article
Effective optimization strategies on modern C++ compilers

- by user168715

I'm working on scientific code that is very performance-critical. An initial version of the code has been written and tested, and now, with profiler in hand, it's time to start shaving cycles from the hot spots. It's well-known that some optimizations, e.g. loop unrolling, are handled these days much more effectively by the compiler than by a programmer meddling by hand. Which techniques are still worthwhile? Obviously, I'll run everything I try through a profiler, but if there's conventional wisdom as to what tends to work and what doesn't, it would save me significant time. I know that optimization is very compiler- and architecture- dependent. I'm using Intel's C++ compiler targeting the Core 2 Duo, but I'm also interested in what works well for gcc, or for "any modern compiler." Here are some concrete ideas I'm considering: Is there any benefit to replacing STL containers/algorithms with hand-rolled ones? In particular, my program includes a very large priority queue (currently a std::priority_queue) whose manipulation is taking a lot of total time. Is this something worth looking into, or is the STL implementation already likely the fastest possible? Along similar lines, for std::vectors whose needed sizes are unknown but have a reasonably small upper bound, is it profitable to replace them with statically-allocated arrays? I've found that dynamic memory allocation is often a severe bottleneck, and that eliminating it can lead to significant speedups. As a consequence I'm interesting in the performance tradeoffs of returning large temporary data structures by value vs. returning by pointer vs. passing the result in by reference. Is there a way to reliably determine whether or not the compiler will use RVO for a given method (assuming the caller doesn't need to modify the result, of course)? How cache-aware do compilers tend to be? For example, is it worth looking into reordering nested loops? Given the scientific nature of the program, floating-point numbers are used everywhere. A significant bottleneck in my code used to be conversions from floating point to integers: the compiler would emit code to save the current rounding mode, change it, perform the conversion, then restore the old rounding mode --- even though nothing in the program ever changed the rounding mode! Disabling this behavior significantly sped up my code. Are there any similar floating-point-related gotchas I should be aware of? One consequence of C++ being compiled and linked separately is that the compiler is unable to do what would seem to be very simple optimizations, such as move method calls like strlen() out of the termination conditions of loop. Are there any optimization like this one that I should look out for because they can't be done by the compiler and must be done by hand? On the flip side, are there any techniques I should avoid because they are likely to interfere with the compiler's ability to automatically optimize code? Lastly, to nip certain kinds of answers in the bud: I understand that optimization has a cost in terms of complexity, reliability, and maintainability. For this particular application, increased performance is worth these costs. I understand that the best optimizations are often to improve the high-level algorithms, and this has already been done.

Read the article
IXmlSerializable Dictionary

- by Shimmy

I was trying to create a generic Dictionary that implements IXmlSerializable. Here is my trial: Sub Main() Dim z As New SerializableDictionary(Of String, String) z.Add("asdf", "asd") Console.WriteLine(z.Serialize) End Sub Result: <?xml version="1.0" encoding="utf-16"?><Entry key="asdf" value="asd" /> I placed a breakpoint on top of the WriteXml method and I see that when it stops, the writer contains no data at all, and IMHO it should contain the root element and the xml declaration. <Serializable()> _ Public Class SerializableDictionary(Of TKey, TValue) : Inherits Dictionary(Of TKey, TValue) : Implements IXmlSerializable Private Const EntryString As String = "Entry" Private Const KeyString As String = "key" Private Const ValueString As String = "value" Private Shared ReadOnly AttributableTypes As Type() = New Type() {GetType(Boolean), GetType(Byte), GetType(Char), GetType(DateTime), GetType(Decimal), GetType(Double), GetType([Enum]), GetType(Guid), GetType(Int16), GetType(Int32), GetType(Int64), GetType(SByte), GetType(Single), GetType(String), GetType(TimeSpan), GetType(UInt16), GetType(UInt32), GetType(UInt64)} Private Shared ReadOnly GetIsAttributable As Predicate(Of Type) = Function(t) AttributableTypes.Contains(t) Private Shared ReadOnly IsKeyAttributable As Boolean = GetIsAttributable(GetType(TKey)) Private Shared ReadOnly IsValueAttributable As Boolean = GetIsAttributable(GetType(TValue)) Private Shared ReadOnly GetElementName As Func(Of Boolean, String) = Function(isKey) If(isKey, KeyString, ValueString) Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema Return Nothing End Function Public Sub WriteXml(ByVal writer As XmlWriter) Implements IXmlSerializable.WriteXml For Each entry In Me writer.WriteStartElement(EntryString) WriteData(IsKeyAttributable, writer, True, entry.Key) WriteData(IsValueAttributable, writer, False, entry.Value) writer.WriteEndElement() Next End Sub Private Sub WriteData(Of T)(ByVal attributable As Boolean, ByVal writer As XmlWriter, ByVal isKey As Boolean, ByVal value As T) Dim name = GetElementName(isKey) If attributable Then writer.WriteAttributeString(name, value.ToString) Else Dim serializer As New XmlSerializer(GetType(T)) writer.WriteStartElement(name) serializer.Serialize(writer, value) writer.WriteEndElement() End If End Sub Public Sub ReadXml(ByVal reader As XmlReader) Implements IXmlSerializable.ReadXml Dim empty = reader.IsEmptyElement reader.Read() If empty Then Exit Sub Clear() While reader.NodeType <> XmlNodeType.EndElement While reader.NodeType = XmlNodeType.Whitespace reader.Read() Dim key = ReadData(Of TKey)(IsKeyAttributable, reader, True) Dim value = ReadData(Of TValue)(IsValueAttributable, reader, False) Add(key, value) If Not IsKeyAttributable AndAlso Not IsValueAttributable Then reader.ReadEndElement() Else reader.Read() While reader.NodeType = XmlNodeType.Whitespace reader.Read() End While End While reader.ReadEndElement() End While End Sub Private Function ReadData(Of T)(ByVal attributable As Boolean, ByVal reader As XmlReader, ByVal isKey As Boolean) As T Dim name = GetElementName(isKey) Dim type = GetType(T) If attributable Then Return Convert.ChangeType(reader.GetAttribute(name), type) Else Dim serializer As New XmlSerializer(type) While reader.Name <> name reader.Read() End While reader.ReadStartElement(name) Dim value = serializer.Deserialize(reader) reader.ReadEndElement() Return value End If End Function Public Shared Function Serialize(ByVal dictionary As SerializableDictionary(Of TKey, TValue)) As String Dim sb As New StringBuilder(1024) Dim sw As New StringWriter(sb) Dim xs As New XmlSerializer(GetType(SerializableDictionary(Of TKey, TValue))) xs.Serialize(sw, dictionary) sw.Dispose() Return sb.ToString End Function Public Shared Function Deserialize(ByVal xml As String) As SerializableDictionary(Of TKey, TValue) Dim xs As New XmlSerializer(GetType(SerializableDictionary(Of TKey, TValue))) Dim xr As New XmlTextReader(xml, XmlNodeType.Document, Nothing) Return xs.Deserialize(xr) xr.Close() End Function Public Function Serialize() As String Dim sb As New StringBuilder Dim xw = XmlWriter.Create(sb) WriteXml(xw) xw.Close() Return sb.ToString End Function Public Sub Parse(ByVal xml As String) Dim xr As New XmlTextReader(xml, XmlNodeType.Document, Nothing) ReadXml(xr) xr.Close() End Sub End Class

Read the article
Conditions with whitespaces in activerecord find

- by so1o

So i am trying to do this Order.find :all, :conditions => "org = 'test org'" what ends up firing is SELECT * FROM `orders` WHERE (org = 'test org') the whitespace in the argument gets stripped. what am i missing.. im really stumped here. please help!

Read the article
Find and replace in a block of text.

- by David Tildon

I have a database full of text fields that look like this: (paragraph of normal text) image:blog/clownboy.jpg (another paragraph) I'm trying to write a view helper for Rails that will take one of these big blocks of text, find bits like "image:blog/clownboy.jpg" and replace them with the corresponding <img src="blog/clownboy.jpg"> (without disturbing the surrounding whitespace) before outputting it to the user. I've been trying for an hour or so now, but I'm new to Ruby and the regular expressions are still a bit beyond me.

Read the article
In Objective C, what's the best way to extract multiple substrings of text around multiple patterns?

- by Matt

For one NSString, I have N pattern strings. I'd like to extract substrings "around" the pattern matches. So, if i have "the quick brown fox jumped over the lazy dog" and my patterns are "brown" and "lazy" i would like to get "quick brown fox" and "the lazy dog." However, the substrings don't necessarily need to be delimited by whitespace. I have a hunch that there's a very easy solution to this, but I admit a disturbing lack of knowledge of Objective C string functions.

Read the article
IXmlSerializable Dictionary problem

- by Shimmy

I was trying to create a generic Dictionary that implements IXmlSerializable. Here is my trial: Sub Main() Dim z As New SerializableDictionary(Of String, String) z.Add("asdf", "asd") Console.WriteLine(z.Serialize) End Sub Result: <?xml version="1.0" encoding="utf-16"?><Entry key="asdf" value="asd" /> I placed a breakpoint on top of the WriteXml method and I see that when it stops, the writer contains no data at all, and IMHO it should contain the root element and the xml declaration. <Serializable()> _ Public Class SerializableDictionary(Of TKey, TValue) : Inherits Dictionary(Of TKey, TValue) : Implements IXmlSerializable Private Const EntryString As String = "Entry" Private Const KeyString As String = "key" Private Const ValueString As String = "value" Private Shared ReadOnly AttributableTypes As Type() = New Type() {GetType(Boolean), GetType(Byte), GetType(Char), GetType(DateTime), GetType(Decimal), GetType(Double), GetType([Enum]), GetType(Guid), GetType(Int16), GetType(Int32), GetType(Int64), GetType(SByte), GetType(Single), GetType(String), GetType(TimeSpan), GetType(UInt16), GetType(UInt32), GetType(UInt64)} Private Shared ReadOnly GetIsAttributable As Predicate(Of Type) = Function(t) AttributableTypes.Contains(t) Private Shared ReadOnly IsKeyAttributable As Boolean = GetIsAttributable(GetType(TKey)) Private Shared ReadOnly IsValueAttributable As Boolean = GetIsAttributable(GetType(TValue)) Private Shared ReadOnly GetElementName As Func(Of Boolean, String) = Function(isKey) If(isKey, KeyString, ValueString) Public Function GetSchema() As System.Xml.Schema.XmlSchema Implements System.Xml.Serialization.IXmlSerializable.GetSchema Return Nothing End Function Public Sub WriteXml(ByVal writer As XmlWriter) Implements IXmlSerializable.WriteXml For Each entry In Me writer.WriteStartElement(EntryString) WriteData(IsKeyAttributable, writer, True, entry.Key) WriteData(IsValueAttributable, writer, False, entry.Value) writer.WriteEndElement() Next End Sub Private Sub WriteData(Of T)(ByVal attributable As Boolean, ByVal writer As XmlWriter, ByVal isKey As Boolean, ByVal value As T) Dim name = GetElementName(isKey) If attributable Then writer.WriteAttributeString(name, value.ToString) Else Dim serializer As New XmlSerializer(GetType(T)) writer.WriteStartElement(name) serializer.Serialize(writer, value) writer.WriteEndElement() End If End Sub Public Sub ReadXml(ByVal reader As XmlReader) Implements IXmlSerializable.ReadXml Dim empty = reader.IsEmptyElement reader.Read() If empty Then Exit Sub Clear() While reader.NodeType <> XmlNodeType.EndElement While reader.NodeType = XmlNodeType.Whitespace reader.Read() Dim key = ReadData(Of TKey)(IsKeyAttributable, reader, True) Dim value = ReadData(Of TValue)(IsValueAttributable, reader, False) Add(key, value) If Not IsKeyAttributable AndAlso Not IsValueAttributable Then reader.ReadEndElement() Else reader.Read() While reader.NodeType = XmlNodeType.Whitespace reader.Read() End While End While reader.ReadEndElement() End While End Sub Private Function ReadData(Of T)(ByVal attributable As Boolean, ByVal reader As XmlReader, ByVal isKey As Boolean) As T Dim name = GetElementName(isKey) Dim type = GetType(T) If attributable Then Return Convert.ChangeType(reader.GetAttribute(name), type) Else Dim serializer As New XmlSerializer(type) While reader.Name <> name reader.Read() End While reader.ReadStartElement(name) Dim value = serializer.Deserialize(reader) reader.ReadEndElement() Return value End If End Function Public Shared Function Serialize(ByVal dictionary As SerializableDictionary(Of TKey, TValue)) As String Dim sb As New StringBuilder(1024) Dim sw As New StringWriter(sb) Dim xs As New XmlSerializer(GetType(SerializableDictionary(Of TKey, TValue))) xs.Serialize(sw, dictionary) sw.Dispose() Return sb.ToString End Function Public Shared Function Deserialize(ByVal xml As String) As SerializableDictionary(Of TKey, TValue) Dim xs As New XmlSerializer(GetType(SerializableDictionary(Of TKey, TValue))) Dim xr As New XmlTextReader(xml, XmlNodeType.Document, Nothing) Return xs.Deserialize(xr) xr.Close() End Function Public Function Serialize() As String Dim sb As New StringBuilder Dim xw = XmlWriter.Create(sb) WriteXml(xw) xw.Close() Return sb.ToString End Function Public Sub Parse(ByVal xml As String) Dim xr As New XmlTextReader(xml, XmlNodeType.Document, Nothing) ReadXml(xr) xr.Close() End Sub End Class

Read the article
Regex to Match White Space or End of String

- by Kirk

I'm trying to find every instance of @username in comment text and replace it with a link. Here's my PHP so far: $comment = preg_replace('/@(.+?)\s/', '<a href="/users/${1}/">@${1}</a> ', $comment); The only problem is the regex is dependent upon there being whitespace after the @username reference. Can anyone help me tweak this so it will also match if it is at the end of the string?

Read the article
QTableWidget: How can I get tighter lines with less vertical spacing padding?

- by Dirk Eddelbuettel

The QTableWdiget is fabulous for simple grid displays. Changing colors, fonts, etc is straightforward. However, I did not manage to give the grid a 'tighter' look with less vertical whitespace. I see that the Qt documentation talks (eg here) about margin border padding around widgets, but when I set these I only get changes around the entire grid widget rather than inside. How can I set this (with a style sheet, or hard-coded options) directly to make the QTableWidget display tighter?

Read the article
Optimizing a lot of Scanner.findWithinHorizon(pattern, 0) calls

- by darvids0n

I'm building a process which extracts data from 6 csv-style files and two poorly laid out .txt reports and builds output CSVs, and I'm fully aware that there's going to be some overhead searching through all that whitespace thousands of times, but I never anticipated converting about about 50,000 records would take 12 hours. Excerpt of my manual matching code (I know it's horrible that I use lists of tokens like that, but it was the best thing I could think of): public static String lookup(List<String> tokensBefore, List<String> tokensAfter) { String result = null; while(_match(tokensBefore)) { // block until all input is read if(id.hasNext()) { result = id.next(); // capture the next token that matches if(_matchImmediate(tokensAfter)) // try to match tokensAfter to this result return result; } else return null; // end of file; no match } return null; // no matches } private static boolean _match(List<String> tokens) { return _match(tokens, true); } private static boolean _match(List<String> tokens, boolean block) { if(tokens != null && !tokens.isEmpty()) { if(id.findWithinHorizon(tokens.get(0), 0) == null) return false; for(int i = 1; i <= tokens.size(); i++) { if (i == tokens.size()) { // matches all tokens return true; } else if(id.hasNext() && !id.next().matches(tokens.get(i))) { break; // break to blocking behaviour } } } else { return true; // empty list always matches } if(block) return _match(tokens); // loop until we find something or nothing else return false; // return after just one attempted match } private static boolean _matchImmediate(List<String> tokens) { if(tokens != null) { for(int i = 0; i <= tokens.size(); i++) { if (i == tokens.size()) { // matches all tokens return true; } else if(!id.hasNext() || !id.next().matches(tokens.get(i))) { return false; // doesn't match, or end of file } } return false; // we have some serious problems if this ever gets called } else { return true; // empty list always matches } } Basically wondering how I would work in an efficient string search (Boyer-Moore or similar). My Scanner id is scanning a java.util.String, figured buffering it to memory would reduce I/O since the search here is being performed thousands of times on a relatively small file. The performance increase compared to scanning a BufferedReader(FileReader(File)) was probably less than 1%, the process still looks to be taking a LONG time. I've also traced execution and the slowness of my overall conversion process is definitely between the first and last like of the lookup method. In fact, so much so that I ran a shortcut process to count the number of occurrences of various identifiers in the .csv-style files (I use 2 lookup methods, this is just one of them) and the process completed indexing approx 4 different identifiers for 50,000 records in less than a minute. Compared to 12 hours, that's instant. Some notes (updated): I don't necessarily need the pattern-matching behaviour, I only get the first field of a line of text so I need to match line breaks or use Scanner.nextLine(). All ID numbers I need start at position 0 of a line and run through til the first block of whitespace, after which is the name of the corresponding object. I would ideally want to return a String, not an int locating the line number or start position of the result, but if it's faster then it will still work just fine. If an int is being returned, however, then I would now have to seek to that line again just to get the ID; storing the ID of every line that is searched sounds like a way around that. Anything to help me out, even if it saves 1ms per search, will help, so all input is appreciated. Thankyou! Usage scenario 1: I have a list of objects in file A, who in the old-style system have an id number which is not in file A. It is, however, POSSIBLY in another csv-style file (file B) or possibly still in a .txt report (file C) which each also contain a bunch of other information which is not useful here, and so file B needs to be searched through for the object's full name (1 token since it would reside within the second column of any given line), and then the first column should be the ID number. If that doesn't work, we then have to split the search token by whitespace into separate tokens before doing a search of file C for those tokens as well. Generalised code: String field; for (/* each record in file A */) { /* construct the rest of this object from file A info */ // now to find the ID, if we can List<String> objectName = new ArrayList<String>(1); objectName.add(Pattern.quote(thisObject.fullName)); field = lookup(objectSearchToken, objectName); // search file B if(field == null) // not found in file B { lookupReset(false); // initialise scanner to check file C objectName.clear(); // not using the full name String[] tokens = thisObject.fullName.split(id.delimiter().pattern()); for(String s : tokens) objectName.add(Pattern.quote(s)); field = lookup(objectSearchToken, objectName); // search file C lookupReset(true); // back to file B } else { /* found it, file B specific processing here */ } if(field != null) // found it in B or C thisObject.ID = field; } The objectName tokens are all uppercase words with possible hyphens or apostrophes in them, separated by spaces. Much like a person's name. As per a comment, I will pre-compile the regex for my objectSearchToken, which is just [\r\n]+. What's ending up happening in file C is, every single line is being checked, even the 95% of lines which don't contain an ID number and object name at the start. Would it be quicker to use ^[\r\n]+.*(objectname) instead of two separate regexes? It may reduce the number of _match executions. The more general case of that would be, concatenate all tokensBefore with all tokensAfter, and put a .* in the middle. It would need to be matching backwards through the file though, otherwise it would match the correct line but with a huge .* block in the middle with lots of lines. The above situation could be resolved if I could get java.util.Scanner to return the token previous to the current one after a call to findWithinHorizon. I have another usage scenario. Will put it up asap.

Read the article
Valid URL-string as NSUrl becomes null

- by Mike

Hello, another issue where I seem to have found an solution for ObjC but not MonoTouch. I want a NSUrl from an URL (as string). The string may contain whitespace and backslashes. Why is NSUrl returning null for such string, even though these are valid urls in a browser? For example: NSUrl foo = NSUrl.FromString(@"http://google.com/search?\query"); foo == null Any suggestions?

Read the article

< Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >