Search Results

Search found 16404 results on 657 pages for 'easy transfer'.

Page 124/657 | < Previous Page | 120 121 122 123 124 125 126 127 128 129 130 131 | Next Page >

Why does Java read its default settings from the system

- by Bozho

Java is reading the locale, timezone and encoding information (and perhaps more) from the system it is installed on. This often brings bad surprises (brought me one just yesterday). Say your development and production servers are set to have TimeZone GMT+2. Then you deploy on a production server set to GMT. a 2-hour shift may not be easy to observe immediately. And although you can pass a TimeZone to your calendars, APIs might be instantiating calendars (or dates) using the default timezone. Now, I know one should be careful with these settings, but are easy to miss, hence make programs more error-prone. So, why doesn't Java have its own defaults - UTF-8, GMT, en_US (yes, I'm on non-en_US locale, but having it as default is fine). Applications could read the system settings via some API, if needed. Thus programs would be more predictable. So, what is the reason behind this decision?

Read the article
XSS prevention in PHP

- by newbie

I need to prevent XSS attacks in PHP code, is there nay good and easy library for this?

Read the article
How Do I Prevent a XSS Cross-Site Scripting Attack When Using jQueryUI Autocomplete

- by theschmitzer

I am checking for XSS vulnerabilities in a web application I am developing. This Rails app uses the h method to sanitize HTML it generates. It does, however, make use of the jQueryUI autocomplete widget (new in latest release), where I don't have control over the generated HTML, and I see tags are not getting escaped there. The data fed to autocomplete is retrieved through a JSON request immediately before display. I Possibilities: 1) Autocomplete has an option to sanitize I don't know about 2) There is an easy way to do this in jQuery I don't know about 3) There is an easy way to do this in a Rails controller I don't know about (where I can't use the h method) 4) Disallow < symbol in the model Sugestions?

Read the article
View problem - how to show integer from activity in XML?

- by Oliver Goossens

Hi there, I started two days ago with android, gone through the hello android stuff and also started to read the Hello Android book, which is great. PROBLEM: I use in my app - VERY EASY APP- the XML output. So basically the main activity just tells the android to show the XML layout of main. But what if I have in the activity - code defined integer variable and I want this integer variable also be shown on the display? How do I PUSH the integer variable to the XML??? From main XML reference to other strings in XML is easy - @string/app_name ... but how do I use the integer variable from the activity? Please help Thank you Oliver

Read the article
Java .doc generation

- by bozo

Hi, anyone knows an easy method to generate mail merge .doc file from Java? So, I want to create a Word (95/97) document in Word, put some simple placeholders in it (only single value, no iterators and other advanced tags) like the ones used with mailmerge option, and then at runtime replace those placeholders with values from Java. One option is to use Jasperreports, but this would require that I create exact replica of non-trivial Word document in Jasper format, which is not easy and is hard to change later. Is there some method of filling placeholders in Word from Java, which does not require low-level document alteration with positioning and others low-level .doc tags from code, but something like this: docPreparer.fillPlaceholder('placeholder1', 'my real value from runtime'); Some CRMs do this via ActiveX control for internet explorer, and it works great (they use Word's mailmerge) but I need an all-Java solution. Ideas? Thanks, Bozo

Read the article
Can someone describe through code a practical example of backtracking with iteration instead of recursion?

- by chrisapotek

Recursion makes backtracking easy as it guarantees that you won't go through the same path again. So all ramifications of your path are visited just once. I am trying to convert a backtracking tail-recursive (with accumulators) algorithm to iteration. I heard it is supposed to be easy to convert a perfectly tail-recursive algorithm to iteration. But I am stuck in the backtracking part. Can anyone provide a example through code so myself and others can visualize how backtracking is done? I would think that a STACK is not needed here because I have a perfectly tail-recursive algorithm using accumulators, but I can be wrong here.

Read the article
objective-c setting NSDate to current UTC

- by Brodie4598

is there an easy way to init an NSDate with the current UTC date/time?

Read the article
Git on windows :|

- by Sonic Soul

i've been experimenting with git as my personal code rep.. and it has been a bit of a disaster with windows. i've used Subversion, CVS, and Perforce in the past.. none were as annoying to use as git. i've figured out the PGP part (for github), although my workstation no longer lets me check in, and after searching around it turns out that git bash is using putty which is not that reliable and should be configured with something else.. i was not able to configure it with windows shell extension for a nice visual of what is part of the repository, what is modified, and easy check ins, and easy pushes.. has anyone successfully configured some kind of windows shell client and can efficiently and quickly synchronize various machines? It just seems to be more pain to use than it is worth..

Read the article
css3 design tutorial/template etc

- by Adam

Hello, I search some tutorial/templates how made easy (lightwight) and nice css3 design, or made with some "css firmware" etc

Read the article
WebSocket Applications using Java: JSR 356 Early Draft Now Available (TOTD #183)

- by arungupta

WebSocket provide a full-duplex and bi-directional communication protocol over a single TCP connection. JSR 356 is defining a standard API for creating WebSocket applications in the Java EE 7 Platform. This Tip Of The Day (TOTD) will provide an introduction to WebSocket and how the JSR is evolving to support the programming model. First, a little primer on WebSocket! WebSocket is a combination of IETF RFC 6455 Protocol and W3C JavaScript API (still a Candidate Recommendation). The protocol defines an opening handshake and data transfer. The API enables Web pages to use the WebSocket protocol for two-way communication with the remote host. Unlike HTTP, there is no need to create a new TCP connection and send a chock-full of headers for every message exchange between client and server. The WebSocket protocol defines basic message framing, layered over TCP. Once the initial handshake happens using HTTP Upgrade, the client and server can send messages to each other, independent from the other. There are no pre-defined message exchange patterns of request/response or one-way between client and and server. These need to be explicitly defined over the basic protocol. The communication between client and server is pretty symmetric but there are two differences: A client initiates a connection to a server that is listening for a WebSocket request. A client connects to one server using a URI. A server may listen to requests from multiple clients on the same URI. Other than these two difference, the client and server behave symmetrically after the opening handshake. In that sense, they are considered as "peers". After a successful handshake, clients and servers transfer data back and forth in conceptual units referred as "messages". On the wire, a message is composed of one or more frames. Application frames carry payload intended for the application and can be text or binary data. Control frames carry data intended for protocol-level signaling. Now lets talk about the JSR! The Java API for WebSocket is worked upon as JSR 356 in the Java Community Process. This will define a standard API for building WebSocket applications. This JSR will provide support for: Creating WebSocket Java components to handle bi-directional WebSocket conversations Initiating and intercepting WebSocket events Creation and consumption of WebSocket text and binary messages The ability to define WebSocket protocols and content models for an application Configuration and management of WebSocket sessions, like timeouts, retries, cookies, connection pooling Specification of how WebSocket application will work within the Java EE security model Tyrus is the Reference Implementation for JSR 356 and is already integrated in GlassFish 4.0 Promoted Builds. And finally some code! The API allows to create WebSocket endpoints using annotations and interface. This TOTD will show a simple sample using annotations. A subsequent blog will show more advanced samples. A POJO can be converted to a WebSocket endpoint by specifying @WebSocketEndpoint and @WebSocketMessage. @WebSocketEndpoint(path="/hello")public class HelloBean { @WebSocketMessage public String sayHello(String name) { return "Hello " + name + "!"; }} @WebSocketEndpoint marks this class as a WebSocket endpoint listening at URI defined by the path attribute. The @WebSocketMessage identifies the method that will receive the incoming WebSocket message. This first method parameter is injected with payload of the incoming message. In this case it is assumed that the payload is text-based. It can also be of the type byte[] in case the payload is binary. A custom object may be specified if decoders attribute is specified in the @WebSocketEndpoint. This attribute will provide a list of classes that define how a custom object can be decoded. This method can also take an optional Session parameter. This is injected by the runtime and capture a conversation between two endpoints. The return type of the method can be String, byte[] or a custom object. The encoders attribute on @WebSocketEndpoint need to define how a custom object can be encoded. The client side is an index.jsp with embedded JavaScript. The JSP body looks like: <div style="text-align: center;"> <form action=""> <input onclick="say_hello()" value="Say Hello" type="button"> <input id="nameField" name="name" value="WebSocket" type="text"><br> </form> </div> <div id="output"></div> The code is relatively straight forward. It has an HTML form with a button that invokes say_hello() method and a text field named nameField. A div placeholder is available for displaying the output. Now, lets take a look at some JavaScript code: <script language="javascript" type="text/javascript"> var wsUri = "ws://localhost:8080/HelloWebSocket/hello"; var websocket = new WebSocket(wsUri); websocket.onopen = function(evt) { onOpen(evt) }; websocket.onmessage = function(evt) { onMessage(evt) }; websocket.onerror = function(evt) { onError(evt) }; function init() { output = document.getElementById("output"); } function say_hello() { websocket.send(nameField.value); writeToScreen("SENT: " + nameField.value); } This application is deployed as "HelloWebSocket.war" (download here) on GlassFish 4.0 promoted build 57. So the WebSocket endpoint is listening at "ws://localhost:8080/HelloWebSocket/hello". A new WebSocket connection is initiated by specifying the URI to connect to. The JavaScript API defines callback methods that are invoked when the connection is opened (onOpen), closed (onClose), error received (onError), or a message from the endpoint is received (onMessage). The client API has several send methods that transmit data over the connection. This particular script sends text data in the say_hello method using nameField's value from the HTML shown earlier. Each click on the button sends the textbox content to the endpoint over a WebSocket connection and receives a response based upon implementation in the sayHello method shown above. How to test this out ? Download the entire source project here or just the WAR file. Download GlassFish4.0 build 57 or later and unzip. Start GlassFish as "asadmin start-domain". Deploy the WAR file as "asadmin deploy HelloWebSocket.war". Access the application at http://localhost:8080/HelloWebSocket/index.jsp. After clicking on "Say Hello" button, the output would look like: Here are some references for you: WebSocket - Protocol and JavaScript API JSR 356: Java API for WebSocket - Specification (Early Draft) and Implementation (already integrated in GlassFish 4 promoted builds) Subsequent blogs will discuss the following topics (not necessary in that order) ... Binary data as payload Custom payloads using encoder/decoder Error handling Interface-driven WebSocket endpoint Java client API Client and Server configuration Security Subprotocols Extensions Other topics from the API Capturing WebSocket on-the-wire messages

Read the article
How to generate table of contents with prawn?

- by flagman

Is there an easy way to generate table of contents with links to corresponding pages?

Read the article
Event Log in VBA

- by Nick

Is there an easy way to write to and read from the windows event log in VBA?

Read the article
T4 Performance Counters explained

- by user13346607

Now that T4 is out for a few month some people might have wondered what details of the new pipeline you can monitor. A "cpustat -h" lists a lot of events that can be monitored, and only very few are self-explanatory. I will try to give some insight on all of them, some of these "PIC events" require an in-depth knowledge of T4 pipeline. Over time I will try to explain these, for the time being these events should simply be ignored. (Side note: some counters changed from tape-out 1.1 (*only* used in the T4 beta program) to tape-out 1.2 (used in the systems shipping today) The table only lists the tape-out 1.2 counters) 0 0 1 1058 6033 Oracle Microelectronics 50 14 7077 14.0 Normal 0 false false false EN-US JA X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:12.0pt; font-family:Cambria; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin;} pic name (cpustat) Prose Comment Sel-pipe-drain-cycles, Sel-0-[wait|ready], Sel-[1,2] Sel-0-wait counts cycles a strand waits to be selected. Some reasons can be counted in detail; these are: Sel-0-ready: Cycles a strand was ready but not selected, that can signal pipeline oversubscription Sel-1: Cycles only one instruction or µop was selected Sel-2: Cycles two instructions or µops were selected Sel-pipe-drain-cycles: cf. PRM footnote 8 to table 10.2 Pick-any, Pick-[0|1|2|3] Cycles one, two, three, no or at least one instruction or µop is picked Instr_FGU_crypto Number of FGU or crypto instructions executed on that vcpu Instr_ld dto. for load Instr_st dto. for store SPR_ring_ops dto. for SPR ring ops Instr_other dto. for all other instructions not listed above, PRM footnote 7 to table 10.2 lists the instructions Instr_all total number of instructions executed on that vcpu Sw_count_intr Nr of S/W count instructions on that vcpu (sethi %hi(fc000),%g0 (whatever that is)) Atomics nr of atomic ops, which are LDSTUB/a, CASA/XA, and SWAP/A SW_prefetch Nr of PREFETCH or PREFETCHA instructions Block_ld_st Block loads or store on that vcpu IC_miss_nospec, IC_miss_[L2_or_L3|local|remote]\ _hit_nospec Various I$ misses, distinguished by where they hit. All of these count per thread, but only primary events: T4 counts only the first occurence of an I$ miss on a core for a certain instruction. If one strand misses in I$ this miss is counted, but if a second strand on the same core misses while the first miss is being resolved, that second miss is not counted This flavour of I$ misses counts only misses that are caused by instruction that really commit (note the "_nospec") BTC_miss Branch target cache miss ITLB_miss ITLB misses (synchronously counted) ITLB_miss_asynch dto. but asynchronously [I|D]TLB_fill_\ [8KB|64KB|4MB|256MB|2GB|trap] H/W tablewalk events that fill ITLB or DTLB with translation for the corresponding page size. The “_trap” event occurs if the HWTW was not able to fill the corresponding TLB IC_mtag_miss, IC_mtag_miss_\ [ptag_hit|ptag_miss|\ ptag_hit_way_mismatch] I$ micro tag misses, with some options for drill down Fetch-0, Fetch-0-all fetch-0 counts nr of cycles nothing was fetched for this particular strand, fetch-0-all counts cycles nothing was fetched for all strands on a core Instr_buffer_full Cycles the instruction buffer for a strand was full, thereby preventing any fetch BTC_targ_incorrect Counts all occurences of wrongly predicted branch targets from the BTC [PQ|ROB|LB|ROB_LB|SB|\ ROB_SB|LB_SB|RB_LB_SB|\ DTLB_miss]\ _tag_wait ST_q_tag_wait is listed under sl=20. These counters monitor pipeline behaviour therefore they are not strand specific: PQ_...: cycles Rename stage waits for a Pick Queue tag (might signal memory bound workload for single thread mode, cf. Mail from Richard Smith) ROB_...: cycles Select stage waits for a ROB (ReOrderBuffer) tag LB_...: cycles Select stage waits for a Load Buffer tag SB_...: cycles Select stage waits for Store Buffer tag combinations of the above are allowed, although some of these events can overlap, the counter will only be incremented once per cycle if any of these occur DTLB_...: cycles load or store instructions wait at Pick stage for a DTLB miss tag [ID]TLB_HWTW_\ [L2_hit|L3_hit|L3_miss|all] Counters for HWTW accesses caused by either DTLB or ITLB misses. Canbe further detailed by where they hit IC_miss_L2_L3_hit, IC_miss_local_remote_remL3_hit, IC_miss I$ prefetches that were dropped because they either miss in L2$ or L3$ This variant counts misses regardless if the causing instruction commits or not DC_miss_nospec, DC_miss_[L2_L3|local|remote_L3]\ _hit_nospec D$ misses either in general or detailed by where they hit cf. the explanation for the IC_miss in two flavours for an explanation of _nospec and the reasoning for two DC_miss counters DTLB_miss_asynch counts all DTLB misses asynchronously, there is no way to count them synchronously DC_pref_drop_DC_hit, SW_pref_drop_[DC_hit|buffer_full] L1-D$ h/w prefetches that were dropped because of a D$ hit, counted per core. The others count software prefetches per strand [Full|Partial]_RAW_hit_st_[buf|q] Count events where a load wants to get data that has not yet been stored, i. e. it is still inside the pipeline. The data might be either still in the store buffer or in the store queue. If the load's data matches in the SB and in the store queue the data in buffer takes precedence of course since it is younger [IC|DC]_evict_invalid, [IC|DC|L1]_snoop_invalid, [IC|DC|L1]_invalid_all Counter for invalidated cache evictions per core St_q_tag_wait Number of cycles pipeline waits for a store queue tag, of course counted per core Data_pref_[drop_L2|drop_L3|\ hit_L2|hit_L3|\ hit_local|hit_remote] Data prefetches that can be further detailed by either why they were dropped or where they did hit St_hit_[L2|L3], St_L2_[local|remote]_C2C, St_local, St_remote Store events distinguished by where they hit or where they cause a L2 cache-to-cache transfer, i.e. either a transfer from another L2$ on the same die or from a different die DC_miss, DC_miss_\ [L2_L3|local|remote]_hit D$ misses either in general or detailed by where they hit cf. the explanation for the IC_miss in two flavours for an explanation of _nospec and the reasoning for two DC_miss counters L2_[clean|dirty]_evict Per core clean or dirty L2$ evictions L2_fill_buf_full, L2_wb_buf_full, L2_miss_buf_full Per core L2$ buffer events, all count number of cycles that this state was present L2_pipe_stall Per core cycles pipeline stalled because of L2$ Branches Count branches (Tcc, DONE, RETRY, and SIT are not counted as branches) Br_taken Counts taken branches (Tcc, DONE, RETRY, and SIT are not counted as branches) Br_mispred, Br_dir_mispred, Br_trg_mispred, Br_trg_mispred_\ [far_tbl|indir_tbl|ret_stk] Counter for various branch misprediction events. Cycles_user counts cycles, attribute setting hpriv, nouser, sys controls addess space to count in Commit-[0|1|2], Commit-0-all, Commit-1-or-2 Number of times either no, one, or two µops commit for a strand. Commit-0-all counts number of times no µop commits for the whole core, cf. footnote 11 to table 10.2 in PRM for a more detailed explanation on how this counters interacts with the privilege levels

Read the article
JSON viewer for browsing APIs

- by Myster

Does anyone have any recommendations for applications or browser plugins that make browsing and visualizing JSON APIs easy.

Read the article
Is it possible to save flv videos streamed via RTMP?

- by rick

It's easy enough to find the flv file on sites that use http to serve it up, but what about when they use RTMP?

Read the article
Continuing education with Visual Basic 9

- by TSSH

I am completely new to programming and starting my education with visual basic 9 and SQL. I have read Visual Basic, in easy steps, 2nd edition by Mike Mcgrath. Although it was very easy to understand, it only gave me an introduction to learning VB. I'm looking for something more in depth, albeit for a beginner. I've read through the questions with a VB and book tag. Unfortunately, most of what I've found here references VB 6 and VB 8. Any recommendations? I cannot afford to create a library or college at the moment. So, I need to make sure that what I do buy counts. Thanks for any suggestions!

Read the article
Am I immoral for using a variable name that differs from its type only by case?

- by Jason Baker

For instance, take this piece of code: var person = new Person(); or for you Pythonistas: person = Person() I'm told constantly how bad this is, but have yet to see an example of the immorality of these two lines of code. To me, person is a Person and trying to give it another name is a waste of time. I suppose in the days before syntax highlighting, this would have been a big deal. But these days, it's pretty easy to tell a type name apart from a variable name. Heck, it's even easy to see the difference here on SO. Or is there something I'm missing? If so, it would be helpful if you could provide an example of code that causes problems.

Read the article
How can I skip the current item and the next in a Python loop?

- by uberjumper

This might be a really dumb question, however I've looked around online, etc. And have not seen a solid answer. Is there a simple way to do something like this? lines = open('something.txt', 'r').readlines() for line in lines: if line == '!': # force iteration forward twice line.next().next() <etc> It's easy to do in C++; just increment the iterator an extra time. Is there an easy way to do that in Python? I would just like to point, out the main purpose of this question is not about "reading files and such" and skipping things. I was more looking for C++ iterator style iteration. Also the new title is kinda dumb, and i dont really think it reflects the nature of my question.

Read the article
Programatically Check if Windows Messaging Installed?

- by Phil Sandler

Is there an easy way to detect whether the messaging component is installed and the service is running in Windows using C#?

Read the article
What are some of your favorite settings in Git configuration files to make Git Fun?

- by Rachel

What are your favorite Git configuration settings which make your life easy while working with Git?

Read the article
Annoying Blank pop-up window go away

- by No Soup for YOU

Hi All, Sorry about the wording for my question title. I have a basic HTML anchor tag that when clicked it is suppose to bring up a dialog box to download a file from a differnt website. I am using an attribute of target="_blank" so that when my hyperlink is clicked, I don't navigate away from my main window. This is all the easy part (if it was so easy I wouldnt be here though). When I do the above though, and click on the hyperlink, an annoying blank window pops up with my download dialog box behind it. How do I get rid of that annoying blank window and keep only my download dialog box on the screen? Below is the HTML I'm working with... <a href="http://www.fake-domain-name.com/downloads/setup.msi" target="_blank"> <img src="images/download.png" alt="download file"/> </a>

Read the article
How to install python physics engine

- by None

I want a python physics engine that works on mac and makes it easy to simulate physics. I have VPython and it works fine, but it is not quite what I want. VPython just shows visual elements and all the physics is in formulas. I looked at the documentation for PyODE and it looked like more what I want. It allowed you to add forces to masses and have worlds and things like that. When I tried to install PyODE (I am using a Mac), it didn't work. One reason was that I didn't have pyrex (I do have Cython, so maybe there is some way to have it use that?), but the other was that I didn't have ode installed. I looked and realized that PyODE is dependent on ode. I tried to install ode but that didn't work. Is there some documentation or binary or something that makes it easy to install PyODE on a mac? Or is there a similar module?

Read the article
Memory management in Qt?

- by Martin

I'm quite new to Qt and am wondering on some basic stuff with memory management and the life of objects. When do I need to delete / destroy my objects? Is any of this handled automatically? In the example below, which of the objects I create do I need to delete? What happens to the instance variable myOtherClass when myClass is destroyed? What happens if I don't delete / destroy my objects at all, will that be a problem to memory? in MyClass.h: class MyClass { public: MyClass(); ~MyClass(); MyOtherClass *myOtherClass; }; in MyClass.cpp: MyClass::MyClass() { myOtherClass = new MyOtherClass(); MyOtherClass myOtherClass2; QString myString = "Hello"; } As you can see this is quite newbie-easy stuff but where can I learn about this in an easy way? Thanks really much

Read the article
NUMA-aware placement of communication variables

- by Dave

For classic NUMA-aware programming I'm typically most concerned about simple cold, capacity and compulsory misses and whether we can satisfy the miss by locally connected memory or whether we have to pull the line from its home node over the coherent interconnect -- we'd like to minimize channel contention and conserve interconnect bandwidth. That is, for this style of programming we're quite aware of where memory is homed relative to the threads that will be accessing it. Ideally, a page is collocated on the node with the thread that's expected to most frequently access the page, as simple misses on the page can be satisfied without resorting to transferring the line over the interconnect. The default "first touch" NUMA page placement policy tends to work reasonable well in this regard. When a virtual page is first accessed, the operating system will attempt to provision and map that virtual page to a physical page allocated from the node where the accessing thread is running. It's worth noting that the node-level memory interleaving granularity is usually a multiple of the page size, so we can say that a given page P resides on some node N. That is, the memory underlying a page resides on just one node. But when thinking about accesses to heavily-written communication variables we normally consider what caches the lines underlying such variables might be resident in, and in what states. We want to minimize coherence misses and cache probe activity and interconnect traffic in general. I don't usually give much thought to the location of the home NUMA node underlying such highly shared variables. On a SPARC T5440, for instance, which consists of 4 T2+ processors connected by a central coherence hub, the home node and placement of heavily accessed communication variables has very little impact on performance. The variables are frequently accessed so likely in M-state in some cache, and the location of the home node is of little consequence because a requester can use cache-to-cache transfers to get the line. Or at least that's what I thought. Recently, though, I was exploring a simple shared memory point-to-point communication model where a client writes a request into a request mailbox and then busy-waits on a response variable. It's a simple example of delegation based on message passing. The server polls the request mailbox, and having fetched a new request value, performs some operation and then writes a reply value into the response variable. As noted above, on a T5440 performance is insensitive to the placement of the communication variables -- the request and response mailbox words. But on a Sun/Oracle X4800 I noticed that was not the case and that NUMA placement of the communication variables was actually quite important. For background an X4800 system consists of 8 Intel X7560 Xeons . Each package (socket) has 8 cores with 2 contexts per core, so the system is 8x8x2. Each package is also a NUMA node and has locally attached memory. Every package has 3 point-to-point QPI links for cache coherence, and the system is configured with a twisted ladder "mobius" topology. The cache coherence fabric is glueless -- there's not central arbiter or coherence hub. The maximum distance between any two nodes is just 2 hops over the QPI links. For any given node, 3 other nodes are 1 hop distant and the remaining 4 nodes are 2 hops distant. Using a single request (client) thread and a single response (server) thread, a benchmark harness explored all permutations of NUMA placement for the two threads and the two communication variables, measuring the average round-trip-time and throughput rate between the client and server. In this benchmark the server simply acts as a simple transponder, writing the request value plus 1 back into the reply field, so there's no particular computation phase and we're only measuring communication overheads. In addition to varying the placement of communication variables over pairs of nodes, we also explored variations where both variables were placed on one page (and thus on one node) -- either on the same cache line or different cache lines -- while varying the node where the variables reside along with the placement of the threads. The key observation was that if the client and server threads were on different nodes, then the best placement of variables was to have the request variable (written by the client and read by the server) reside on the same node as the client thread, and to place the response variable (written by the server and read by the client) on the same node as the server. That is, if you have a variable that's to be written by one thread and read by another, it should be homed with the writer thread. For our simple client-server model that means using split request and response communication variables with unidirectional message flow on a given page. This can yield up to twice the throughput of less favorable placement strategies. Our X4800 uses the QPI 1.0 protocol with source-based snooping. Briefly, when node A needs to probe a cache line it fires off snoop requests to all the nodes in the system. Those recipients then forward their response not to the original requester, but to the home node H of the cache line. H waits for and collects the responses, adjudicates and resolves conflicts and ensures memory-model ordering, and then sends a definitive reply back to the original requester A. If some node B needed to transfer the line to A, it will do so by cache-to-cache transfer and let H know about the disposition of the cache line. A needs to wait for the authoritative response from H. So if a thread on node A wants to write a value to be read by a thread on node B, the latency is dependent on the distances between A, B, and H. We observe the best performance when the written-to variable is co-homed with the writer A. That is, we want H and A to be the same node, as the writer doesn't need the home to respond over the QPI link, as the writer and the home reside on the very same node. With architecturally informed placement of communication variables we eliminate at least one QPI hop from the critical path. Newer Intel processors use the QPI 1.1 coherence protocol with home-based snooping. As noted above, under source-snooping a requester broadcasts snoop requests to all nodes. Those nodes send their response to the home node of the location, which provides memory ordering, reconciles conflicts, etc., and then posts a definitive reply to the requester. In home-based snooping the snoop probe goes directly to the home node and are not broadcast. The home node can consult snoop filters -- if present -- and send out requests to retrieve the line if necessary. The 3rd party owner of the line, if any, can respond either to the home or the original requester (or even to both) according to the protocol policies. There are myriad variations that have been implemented, and unfortunately vendor terminology doesn't always agree between vendors or with the academic taxonomy papers. The key is that home-snooping enables the use of a snoop filter to reduce interconnect traffic. And while home-snooping might have a longer critical path (latency) than source-based snooping, it also may require fewer messages and less overall bandwidth. It'll be interesting to reprise these experiments on a platform with home-based snooping. While collecting data I also noticed that there are placement concerns even in the seemingly trivial case when both threads and both variables reside on a single node. Internally, the cores on each X7560 package are connected by an internal ring. (Actually there are multiple contra-rotating rings). And the last-level on-chip cache (LLC) is partitioned in banks or slices, which with each slice being associated with a core on the ring topology. A hardware hash function associates each physical address with a specific home bank. Thus we face distance and topology concerns even for intra-package communications, although the latencies are not nearly the magnitude we see inter-package. I've not seen such communication distance artifacts on the T2+, where the cache banks are connected to the cores via a high-speed crossbar instead of a ring -- communication latencies seem more regular.

Read the article
project organization between x86 and x64

- by Steve

I have a mixed language solution in VS2008. I would like to structure the bin such that I have bin\x86\Release and bin\x64\Release among the other options. I can do something like this if I have all C# or all C++, but as the platform names don't overlap (x86 vs. Win32) between C++ and C#, I can't seem to do this with the macros (and have an easy solution). I've tried to add a new platform and it won't let me. Is there an easy way to do this?

Read the article

< Previous Page | 120 121 122 123 124 125 126 127 128 129 130 131 | Next Page >