Search Results

Search found 39497 results on 1580 pages for 'two shoes'.

Page 171/1580 | < Previous Page | 167 168 169 170 171 172 173 174 175 176 177 178  | Next Page >

  • Iterators over a LInked List in a Game in Java

    - by Matthew
    I am using OpenGl in android and they have a callback method called draw that gets called with out my control. (As fast as the device can handle if I am not mistaken) I have a list of "GameObjects" that have a .draw method and a .update method. I have two different threads that handle each of those. So, the question is, can I declare two different iterators in two different methods in two different threads that iterate over the same Linked List? If so, do I simply declare ListIterator<GameObject> l = objets.listIterator() each time I want a new iterator and it won't interfere with other iterators?

    Read the article

  • && Operation in .NET

    - by Ram
    Which one out of the following two should be preferred while doing && operation on two values. if (!StartTime.Equals(DateTime.MinValue) && !CreationTime.Equals(DateTime.MinValue)) Or if (!(StartTime.Equals(DateTime.MinValue) && CreationTime.Equals(DateTime.MinValue)) What is the difference between the two?

    Read the article

  • Communicating Between .NET Programs

    - by IronManIngellis
    I wanted to set up a simple data communication between two C# applications, and I'm not sure what the best method is in doing so. I've previously used Java Sockets and ServerSockets to get the job done, but I'm new to C#, so I've come for advice :) It's going to be two way communication with two clients exchanging strings or something of the like.

    Read the article

  • Constructors and method chaining in JavaScript

    - by Sethen Maleno
    I am trying to make method chaining work in conjunction with my constructors, but I am not exactly sure how to go about it. Here is my code thus far: function Points(one, two, three) { this.one = one; this.two = two; this.three = three; } Points.prototype = { add: function() { return this.result = this.one + this.two + this.three; }, multiply: function() { return this.result * 30; } } var some = new Points(1, 1, 1); console.log(some.add().multiply()); I am trying to call the multiply method on the return value of the add method. I know there is something obvious that I am not doing, but I am just not sure what it is. Any thoughts?

    Read the article

  • Explain this PHP shorthand

    - by editor
    I feed like a goof but I don't entirely understand what's happening in this code: $var .= ($one || $two) ? function_one( $one, $another) : function_two( $two, $another); Does that say if $one or $two then $var is equal to fuction_one(), else function_two()? What's the purpose of using this syntax -- speed?

    Read the article

  • UITabBarController for main & about views?

    - by fuzzygoat
    I have two screens on my app that I want to present to users, the first is the main screen and the second is an about screen, brief instructions / credits. Currently I have this setup as a UITabBarController with two buttons and two views. Is this approach acceptable?

    Read the article

  • jQuery Templates and Data Linking (and Microsoft contributing to jQuery)

    - by ScottGu
    The jQuery library has a passionate community of developers, and it is now the most widely used JavaScript library on the web today. Two years ago I announced that Microsoft would begin offering product support for jQuery, and that we’d be including it in new versions of Visual Studio going forward. By default, when you create new ASP.NET Web Forms and ASP.NET MVC projects with VS 2010 you’ll find jQuery automatically added to your project. A few weeks ago during my second keynote at the MIX 2010 conference I announced that Microsoft would also begin contributing to the jQuery project.  During the talk, John Resig -- the creator of the jQuery library and leader of the jQuery developer team – talked a little about our participation and discussed an early prototype of a new client templating API for jQuery. In this blog post, I’m going to talk a little about how my team is starting to contribute to the jQuery project, and discuss some of the specific features that we are working on such as client-side templating and data linking (data-binding). Contributing to jQuery jQuery has a fantastic developer community, and a very open way to propose suggestions and make contributions.  Microsoft is following the same process to contribute to jQuery as any other member of the community. As an example, when working with the jQuery community to improve support for templating to jQuery my team followed the following steps: We created a proposal for templating and posted the proposal to the jQuery developer forum (http://forum.jquery.com/topic/jquery-templates-proposal and http://forum.jquery.com/topic/templating-syntax ). After receiving feedback on the forums, the jQuery team created a prototype for templating and posted the prototype at the Github code repository (http://github.com/jquery/jquery-tmpl ). We iterated on the prototype, creating a new fork on Github of the templating prototype, to suggest design improvements. Several other members of the community also provided design feedback by forking the templating code. There has been an amazing amount of participation by the jQuery community in response to the original templating proposal (over 100 posts in the jQuery forum), and the design of the templating proposal has evolved significantly based on community feedback. The jQuery team is the ultimate determiner on what happens with the templating proposal – they might include it in jQuery core, or make it an official plugin, or reject it entirely.  My team is excited to be able to participate in the open source process, and make suggestions and contributions the same way as any other member of the community. jQuery Template Support Client-side templates enable jQuery developers to easily generate and render HTML UI on the client.  Templates support a simple syntax that enables either developers or designers to declaratively specify the HTML they want to generate.  Developers can then programmatically invoke the templates on the client, and pass JavaScript objects to them to make the content rendered completely data driven.  These JavaScript objects can optionally be based on data retrieved from a server. Because the jQuery templating proposal is still evolving in response to community feedback, the final version might look very different than the version below. This blog post gives you a sense of how you can try out and use templating as it exists today (you can download the prototype by the jQuery core team at http://github.com/jquery/jquery-tmpl or the latest submission from my team at http://github.com/nje/jquery-tmpl).  jQuery Client Templates You create client-side jQuery templates by embedding content within a <script type="text/html"> tag.  For example, the HTML below contains a <div> template container, as well as a client-side jQuery “contactTemplate” template (within the <script type="text/html"> element) that can be used to dynamically display a list of contacts: The {{= name }} and {{= phone }} expressions are used within the contact template above to display the names and phone numbers of “contact” objects passed to the template. We can use the template to display either an array of JavaScript objects or a single object. The JavaScript code below demonstrates how you can render a JavaScript array of “contact” object using the above template. The render() method renders the data into a string and appends the string to the “contactContainer” DIV element: When the page is loaded, the list of contacts is rendered by the template.  All of this template rendering is happening on the client-side within the browser:   Templating Commands and Conditional Display Logic The current templating proposal supports a small set of template commands - including if, else, and each statements. The number of template commands was deliberately kept small to encourage people to place more complicated logic outside of their templates. Even this small set of template commands is very useful though. Imagine, for example, that each contact can have zero or more phone numbers. The contacts could be represented by the JavaScript array below: The template below demonstrates how you can use the if and each template commands to conditionally display and loop the phone numbers for each contact: If a contact has one or more phone numbers then each of the phone numbers is displayed by iterating through the phone numbers with the each template command: The jQuery team designed the template commands so that they are extensible. If you have a need for a new template command then you can easily add new template commands to the default set of commands. Support for Client Data-Linking The ASP.NET team recently submitted another proposal and prototype to the jQuery forums (http://forum.jquery.com/topic/proposal-for-adding-data-linking-to-jquery). This proposal describes a new feature named data linking. Data Linking enables you to link a property of one object to a property of another object - so that when one property changes the other property changes.  Data linking enables you to easily keep your UI and data objects synchronized within a page. If you are familiar with the concept of data-binding then you will be familiar with data linking (in the proposal, we call the feature data linking because jQuery already includes a bind() method that has nothing to do with data-binding). Imagine, for example, that you have a page with the following HTML <input> elements: The following JavaScript code links the two INPUT elements above to the properties of a JavaScript “contact” object that has a “name” and “phone” property: When you execute this code, the value of the first INPUT element (#name) is set to the value of the contact name property, and the value of the second INPUT element (#phone) is set to the value of the contact phone property. The properties of the contact object and the properties of the INPUT elements are also linked – so that changes to one are also reflected in the other. Because the contact object is linked to the INPUT element, when you request the page, the values of the contact properties are displayed: More interesting, the values of the linked INPUT elements will change automatically whenever you update the properties of the contact object they are linked to. For example, we could programmatically modify the properties of the “contact” object using the jQuery attr() method like below: Because our two INPUT elements are linked to the “contact” object, the INPUT element values will be updated automatically (without us having to write any code to modify the UI elements): Note that we updated the contact object above using the jQuery attr() method. In order for data linking to work, you must use jQuery methods to modify the property values. Two Way Linking The linkBoth() method enables two-way data linking. The contact object and INPUT elements are linked in both directions. When you modify the value of the INPUT element, the contact object is also updated automatically. For example, the following code adds a client-side JavaScript click handler to an HTML button element. When you click the button, the property values of the contact object are displayed using an alert() dialog: The following demonstrates what happens when you change the value of the Name INPUT element and click the Save button. Notice that the name property of the “contact” object that the INPUT element was linked to was updated automatically: The above example is obviously trivially simple.  Instead of displaying the new values of the contact object with a JavaScript alert, you can imagine instead calling a web-service to save the object to a database. The benefit of data linking is that it enables you to focus on your data and frees you from the mechanics of keeping your UI and data in sync. Converters The current data linking proposal also supports a feature called converters. A converter enables you to easily convert the value of a property during data linking. For example, imagine that you want to represent phone numbers in a standard way with the “contact” object phone property. In particular, you don’t want to include special characters such as ()- in the phone number - instead you only want digits and nothing else. In that case, you can wire-up a converter to convert the value of an INPUT element into this format using the code below: Notice above how a converter function is being passed to the linkFrom() method used to link the phone property of the “contact” object with the value of the phone INPUT element. This convertor function strips any non-numeric characters from the INPUT element before updating the phone property.  Now, if you enter the phone number (206) 555-9999 into the phone input field then the value 2065559999 is assigned to the phone property of the contact object: You can also use a converter in the opposite direction also. For example, you can apply a standard phone format string when displaying a phone number from a phone property. Combining Templating and Data Linking Our goal in submitting these two proposals for templating and data linking is to make it easier to work with data when building websites and applications with jQuery. Templating makes it easier to display a list of database records retrieved from a database through an Ajax call. Data linking makes it easier to keep the data and user interface in sync for update scenarios. Currently, we are working on an extension of the data linking proposal to support declarative data linking. We want to make it easy to take advantage of data linking when using a template to display data. For example, imagine that you are using the following template to display an array of product objects: Notice the {{link name}} and {{link price}} expressions. These expressions enable declarative data linking between the SPAN elements and properties of the product objects. The current jQuery templating prototype supports extending its syntax with custom template commands. In this case, we are extending the default templating syntax with a custom template command named “link”. The benefit of using data linking with the above template is that the SPAN elements will be automatically updated whenever the underlying “product” data is updated.  Declarative data linking also makes it easier to create edit and insert forms. For example, you could create a form for editing a product by using declarative data linking like this: Whenever you change the value of the INPUT elements in a template that uses declarative data linking, the underlying JavaScript data object is automatically updated. Instead of needing to write code to scrape the HTML form to get updated values, you can instead work with the underlying data directly – making your client-side code much cleaner and simpler. Downloading Working Code Examples of the Above Scenarios You can download this .zip file to get with working code examples of the above scenarios.  The .zip file includes 4 static HTML page: Listing1_Templating.htm – Illustrates basic templating. Listing2_TemplatingConditionals.htm – Illustrates templating with the use of the if and each template commands. Listing3_DataLinking.htm – Illustrates data linking. Listing4_Converters.htm – Illustrates using a converter with data linking. You can un-zip the file to the file-system and then run each page to see the concepts in action. Summary We are excited to be able to begin participating within the open-source jQuery project.  We’ve received lots of encouraging feedback in response to our first two proposals, and we will continue to actively contribute going forward.  These features will hopefully make it easier for all developers (including ASP.NET developers) to build great Ajax applications. Hope this helps, Scott P.S. [In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu]

    Read the article

  • Oracle Coherence & Oracle Service Bus: REST API Integration

    - by Nino Guarnacci
    This post aims to highlight one of the features found in Oracle Coherence which allows it to be easily added and integrated inside a wider variety of projects.  The features in question are the REST API exposed by the Coherence nodes, with which you can interact in the wider mode in memory data grid.Oracle Coherence and Oracle Service Bus are natively integrated through a feature found in the Oracle Service Bus, which allows you to use the coherence grid cache during the configuration phase of a business service. This feature allows you to use an intermediate layer of cache to retrieve the answers from previous invocations of the same service, without necessarily having to invoke the real business service again. Directly from the web console of Oracle Service Bus, you can decide the policies of eviction of the objects / answers and define the discriminating parameters that identify their uniqueness.The coherence REST APIs, however, allow you to integrate both products for other necessities enabling realization of new architectures design.  Consider coherence’s node as a simple service which interoperates through the stardard services and in particular REST (with JSON and XML). Thinking of coherence as a company’s shared service, able to have an implementation of a centralized “map and reduce” which you can access  by a huge variety of protocols (transport and envelopes).An amazing step forward for those who still imagine connectors and code. This type of integration does not require writing custom code or complex implementation to be self-supported. The added value is made unique by the incredible value of both products independently, and still more out of their simple and robust integration.As already mentioned this scenario discovers a hidden new door behind the columns of these two products. The door leads to new ideas and perspectives for enterprise architectures that increasingly wink to next-generation applications: simple and dynamic, perhaps towards the mobile and web 2.0.Below, a small and simple demo useful to demonstrate how easily is to integrate these two products using the Coherence REST API. This demo is also intended to imagine new enterprise architectures using this approach.The idea is to create a centralized system of alerting, fed easily from any company’s application, regardless of the technology with which they were built . Then use a representation standard protocol: RSS, using a service exposed by the service bus; So you can browse and search only the alerts that you are interested on, by category, author, title, date, etc etc.. The steps needed to implement this system are very simple and very few. Here they are listed below and described to be easily replicated within your environment. I would remind you that the demo is only meant to demonstrate how easily is to integrate Oracle Coherence and the Oracle Service Bus, and stimulate your imagination to new technological approaches.1) Install the two products: In this demo used (if necessary, consult the installation guides of 2 products)  - Oracle Service Bus ver. 11.1.1.5.0 http://www.oracle.com/technetwork/middleware/service-bus/downloads/index.html - Oracle Coherence ver. 3.7.1 http://www.oracle.com/technetwork/middleware/coherence/downloads/index.html 2) Because you choose to create a centralized alerting system, we need to define a structure type containing some alerting attributes useful to preserve and organize the information of the various alerts sent by the different applications. Here, then it was built a java class named Alert containing the canonical properties of an alarm information:- Title- Description- System- Time- Severity 3) Therefore, we need to create two configuration files for the coherence node, in order to save the Alert objects within the grid, through the rest/http protocol (more than the native API for Java, C + +, C,. Net). Here are the two minimal configuration files for Coherence:coherence-rest-config.xml resty-server-config.xml This minimum configuration allows me to use a distributed cache named "alerts" that can  also be accessed via http - rest on the host "localhost" over port "8080", objects are of type “oracle.cohsb.Alert”. 4) Below  a simple Java class that represents the type of alert messages: 5) At this point we just need to startup our coherence node, able to listen on http protocol to manage the “alerts” cache, which will receive incoming XML or JSON objects of type Alert. Remember to include in the classpath of the coherence node, the Alert java class and the following coherence libraries and configuration files:  At this point, just run the coherence class node “com.tangosol.net.DefaultCacheServer”advising you to set the following parameters:-Dtangosol.coherence.log.level=9 -Dtangosol.coherence.log=stdout -Dtangosol.coherence.cacheconfig=[PATH_TO_THE_FILE]\resty-server-config.xml 6) Let's create a procedure to test our configuration of Coherence and in order to insert some custom alerts in our cache. The technology with which you want to achieve this functionality is fully not considerable: Javascript, Python, Ruby, Scala, C + +, Java.... Because the protocol to communicate with Coherence is simply HTTP / JSON or XML. For this little demo i choose Java: A method to send/put the alert to the cache: A method to query and view the content of the cache: Finally the main method that execute our methods:  No special library added in the classpath for our class (json struct static defined), when it will be executed, it asks some information such as title, description,... in order to compose and send an alert to the cache and then it will perform an inquiry, to the same cache. At this point, a good exercise at this point, may be to create the same procedure using other technologies, such as a simple html page containing some JavaScript code, and then using Python, Ruby, and so on.7) Now we are ready to start configuring the Oracle Service Bus in order to integrate the two products. First integrate the internal alerting system of Oracle Service Bus with our centralized alerting system based on coherence node. This ensures that by monitoring, or directly from within our Proxy Message Flow, we can throw alerts and save them directly into the Coherence node. To do this I choose to use the jms technology, natively present inside the Oracle Weblogic / Service Bus. Access to the Oracle WebLogic Administration console and create and configure a new JMS connection factory and a new jms destination (queue). Now we should create a new resource of type “alert destination” within our Oracle Service Bus project. The new “alert destination” resource should be configured using the newly created connection factory jms and jms destination. Finally, in order to withdraw the message alert enqueued in our JMS destination and send it to our coherence node, we just need to create a new business service and proxy service within our Oracle Service Bus project.Our business service is responsible for sending a message to our REST service Coherence using as a method action: PUT Finally our proxy service have to collect all messages enqueued on the destination, execute an xquery transformation on those messages  in order to translate them into valid XML / alert objects useful to be sent to our coherence service, through the newly created business service. The message flow pipeline containing the xquery transformation: Incredibly,  we just did a basic first integration between the native alerting system of Oracle Service Bus and our centralized alerting system by simply configuring our coherence node without developing anything.It's time to test it out. To do this I create a proxy service able to generate an alert using our "alert destination", whenever the proxy is invoked. After some invocation to our proxy that generates fake alerts, we could open an Internet browser and type the URL  http://localhost: 8080/alerts/  so we could see what has been inserted within the coherence node. 8) We are ready for the final step.  We would create a new message flow, that can be used to search and display the results in standard mode. To do this I choosen the standard representation of RSS, to display a formatted result on a huge variety of devices such as readers for the iPhone and Android. The inquiry may be defined already at the time of the request able to return only feed / items related to our needs. To do this we need to create a new business service, a new proxy service, and finally a new XQuery Transformation to take care of translating the collection of alerts that will be return from our coherence node in a nicely formatted RSS standard document.So we start right from this resource (xquery), which has the task of transforming a collection of alerts / xml returned from the node coherence in a type well-formatted feed RSS 2.0 our new business service that will search the alerts on our coherence node using the Rest API. And finally, our last resource, the proxy service that will be exposed as an RSS / feeds to various mobile devices and traditional web readers, in which we will intercept any search query, and transform the result returned by the business service in an RSS feed 2.0. The message flow with the transformation phase (Alert TO Feed Items): Finally some little tricks to follow during the routing to the business service, - check for any queries present in the url to require a subset of alerts  - the http header "Accept" to help get an answer XML instead of JSON: In our little demo we also static added some coherence parameters to the request:sort=time:desc;start=0;count=100I would like to get from Coherence that the results will be sorted by date, and starting from 1 up to a maximum of 100.Done!!Just incredible, our centralized alerting system is ready. Inheriting all the qualities and capabilities of the two products involved Oracle Coherence & Oracle Service Bus: - RASP (Reliability, Availability, Scalability, Performance)Now try to use your mobile device, or a normal Internet browser by accessing the RSS just published: Some urls you may test: Search for the last 100 alerts : http://localhost:7001/alarmsSearch for alerts that do not have time set to null (time is not null):http://localhost:7001/alarms?q=time+is+not+nullSearch for alerts that the system property is “Web Browser” (system = ‘Web Browser’):http://localhost:7001/alarms?q=system+%3D+%27Web+Browser%27Search for alerts that the system property is “Web Browser” and the severity property is “Fatal” and the title property contain the word “Javascript”  (system = ‘Web Broser’ and severity = ‘Fatal’ and title like ‘%Javascript%’)http://localhost:8080/alerts?q=system+%3D+%27Web+Browser%27+AND+severity+%3D+%27Fatal%27+AND+title+LIKE+%27%25Javascript%25%27 To compose more complex queries about your need I would suggest you to read the chapter in the coherence documentation inherent the Cohl language (Coherence Query Language) http://download.oracle.com/docs/cd/E24290_01/coh.371/e22837/api_cq.htm . Some useful links: - Oracle Coherence REST API Documentation http://download.oracle.com/docs/cd/E24290_01/coh.371/e22839/rest_intro.htm - Oracle Service Bus Documentation http://download.oracle.com/docs/cd/E21764_01/soa.htm#osb - REST explanation from Wikipedia http://en.wikipedia.org/wiki/Representational_state_transfer At this URL could be downloaded the whole materials of this demo http://blogs.oracle.com/slc/resource/cosb/coh-sb-demo.zip Author: Nino Guarnacci.

    Read the article

  • Inside the Concurrent Collections: ConcurrentBag

    - by Simon Cooper
    Unlike the other concurrent collections, ConcurrentBag does not really have a non-concurrent analogy. As stated in the MSDN documentation, ConcurrentBag is optimised for the situation where the same thread is both producing and consuming items from the collection. We'll see how this is the case as we take a closer look. Again, I recommend you have ConcurrentBag open in a decompiler for reference. Thread Statics ConcurrentBag makes heavy use of thread statics - static variables marked with ThreadStaticAttribute. This is a special attribute that instructs the CLR to scope any values assigned to or read from the variable to the executing thread, not globally within the AppDomain. This means that if two different threads assign two different values to the same thread static variable, one value will not overwrite the other, and each thread will see the value they assigned to the variable, separately to any other thread. This is a very useful function that allows for ConcurrentBag's concurrency properties. You can think of a thread static variable: [ThreadStatic] private static int m_Value; as doing the same as: private static Dictionary<Thread, int> m_Values; where the executing thread's identity is used to automatically set and retrieve the corresponding value in the dictionary. In .NET 4, this usage of ThreadStaticAttribute is encapsulated in the ThreadLocal class. Lists of lists ConcurrentBag, at its core, operates as a linked list of linked lists: Each outer list node is an instance of ThreadLocalList, and each inner list node is an instance of Node. Each outer ThreadLocalList is owned by a particular thread, accessible through the thread local m_locals variable: private ThreadLocal<ThreadLocalList<T>> m_locals It is important to note that, although the m_locals variable is thread-local, that only applies to accesses through that variable. The objects referenced by the thread (each instance of the ThreadLocalList object) are normal heap objects that are not specific to any thread. Thinking back to the Dictionary analogy above, if each value stored in the dictionary could be accessed by other means, then any thread could access the value belonging to other threads using that mechanism. Only reads and writes to the variable defined as thread-local are re-routed by the CLR according to the executing thread's identity. So, although m_locals is defined as thread-local, the m_headList, m_nextList and m_tailList variables aren't. This means that any thread can access all the thread local lists in the collection by doing a linear search through the outer linked list defined by these variables. Adding items So, onto the collection operations. First, adding items. This one's pretty simple. If the current thread doesn't already own an instance of ThreadLocalList, then one is created (or, if there are lists owned by threads that have stopped, it takes control of one of those). Then the item is added to the head of that thread's list. That's it. Don't worry, it'll get more complicated when we account for the other operations on the list! Taking & Peeking items This is where it gets tricky. If the current thread's list has items in it, then it peeks or removes the head item (not the tail item) from the local list and returns that. However, if the local list is empty, it has to go and steal another item from another list, belonging to a different thread. It iterates through all the thread local lists in the collection using the m_headList and m_nextList variables until it finds one that has items in it, and it steals one item from that list. Up to this point, the two threads had been operating completely independently. To steal an item from another thread's list, the stealing thread has to do it in such a way as to not step on the owning thread's toes. Recall how adding and removing items both operate on the head of the thread's linked list? That gives us an easy way out - a thread trying to steal items from another thread can pop in round the back of another thread's list using the m_tail variable, and steal an item from the back without the owning thread knowing anything about it. The owning thread can carry on completely independently, unaware that one of its items has been nicked. However, this only works when there are at least 3 items in the list, as that guarantees there will be at least one node between the owning thread performing operations on the list head and the thread stealing items from the tail - there's no chance of the two threads operating on the same node at the same time and causing a race condition. If there's less than three items in the list, then there does need to be some synchronization between the two threads. In this case, the lock on the ThreadLocalList object is used to mediate access to a thread's list when there's the possibility of contention. Thread synchronization In ConcurrentBag, this is done using several mechanisms: Operations performed by the owner thread only take out the lock when there are less than three items in the collection. With three or greater items, there won't be any conflict with a stealing thread operating on the tail of the list. If a lock isn't taken out, the owning thread sets the list's m_currentOp variable to a non-zero value for the duration of the operation. This indicates to all other threads that there is a non-locked operation currently occuring on that list. The stealing thread always takes out the lock, to prevent two threads trying to steal from the same list at the same time. After taking out the lock, the stealing thread spinwaits until m_currentOp has been set to zero before actually performing the steal. This ensures there won't be a conflict with the owning thread when the number of items in the list is on the 2-3 item borderline. If any add or remove operations are started in the meantime, and the list is below 3 items, those operations try to take out the list's lock and are blocked until the stealing thread has finished. This allows a thread to steal an item from another thread's list without corrupting it. What about synchronization in the collection as a whole? Collection synchronization Any thread that operates on the collection's global structure (accessing anything outside the thread local lists) has to take out the collection's global lock - m_globalListsLock. This single lock is sufficient when adding a new thread local list, as the items inside each thread's list are unaffected. However, what about operations (such as Count or ToArray) that need to access every item in the collection? In order to ensure a consistent view, all operations on the collection are stopped while the count or ToArray is performed. This is done by freezing the bag at the start, performing the global operation, and unfreezing at the end: The global lock is taken out, to prevent structural alterations to the collection. m_needSync is set to true. This notifies all the threads that they need to take out their list's lock irregardless of what operation they're doing. All the list locks are taken out in order. This blocks all locking operations on the lists. The freezing thread waits for all current lockless operations to finish by spinwaiting on each m_currentOp field. The global operation can then be performed while the bag is frozen, but no other operations can take place at the same time, as all other threads are blocked on a list's lock. Then, once the global operation has finished, the locks are released, m_needSync is unset, and normal concurrent operation resumes. Concurrent principles That's the essence of how ConcurrentBag operates. Each thread operates independently on its own local list, except when they have to steal items from another list. When stealing, only the stealing thread is forced to take out the lock; the owning thread only has to when there is the possibility of contention. And a global lock controls accesses to the structure of the collection outside the thread lists. Operations affecting the entire collection take out all locks in the collection to freeze the contents at a single point in time. So, what principles can we extract here? Threads operate independently Thread-static variables and ThreadLocal makes this easy. Threads operate entirely concurrently on their own structures; only when they need to grab data from another thread is there any thread contention. Minimised lock-taking Even when two threads need to operate on the same data structures (one thread stealing from another), they do so in such a way such that the probability of actually blocking on a lock is minimised; the owning thread always operates on the head of the list, and the stealing thread always operates on the tail. Management of lockless operations Any operations that don't take out a lock still have a 'hook' to force them to lock when necessary. This allows all operations on the collection to be stopped temporarily while a global snapshot is taken. Hopefully, such operations will be short-lived and infrequent. That's all the concurrent collections covered. I hope you've found it as informative and interesting as I have. Next, I'll be taking a closer look at ThreadLocal, which I came across while analyzing ConcurrentBag. As you'll see, the operation of this class deserves a much closer look.

    Read the article

  • value types in the vm

    - by john.rose
    value types in the vm p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} p.p2 {margin: 0.0px 0.0px 14.0px 0.0px; font: 14.0px Times} p.p3 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times} p.p4 {margin: 0.0px 0.0px 15.0px 0.0px; font: 14.0px Times} p.p5 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier} p.p6 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier; min-height: 17.0px} p.p7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p8 {margin: 0.0px 0.0px 0.0px 36.0px; text-indent: -36.0px; font: 14.0px Times; min-height: 18.0px} p.p9 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p10 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; color: #000000} li.li1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} li.li7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} span.s1 {font: 14.0px Courier} span.s2 {color: #000000} span.s3 {font: 14.0px Courier; color: #000000} ol.ol1 {list-style-type: decimal} Or, enduring values for a changing world. Introduction A value type is a data type which, generally speaking, is designed for being passed by value in and out of methods, and stored by value in data structures. The only value types which the Java language directly supports are the eight primitive types. Java indirectly and approximately supports value types, if they are implemented in terms of classes. For example, both Integer and String may be viewed as value types, especially if their usage is restricted to avoid operations appropriate to Object. In this note, we propose a definition of value types in terms of a design pattern for Java classes, accompanied by a set of usage restrictions. We also sketch the relation of such value types to tuple types (which are a JVM-level notion), and point out JVM optimizations that can apply to value types. This note is a thought experiment to extend the JVM’s performance model in support of value types. The demonstration has two phases.  Initially the extension can simply use design patterns, within the current bytecode architecture, and in today’s Java language. But if the performance model is to be realized in practice, it will probably require new JVM bytecode features, changes to the Java language, or both.  We will look at a few possibilities for these new features. An Axiom of Value In the context of the JVM, a value type is a data type equipped with construction, assignment, and equality operations, and a set of typed components, such that, whenever two variables of the value type produce equal corresponding values for their components, the values of the two variables cannot be distinguished by any JVM operation. Here are some corollaries: A value type is immutable, since otherwise a copy could be constructed and the original could be modified in one of its components, allowing the copies to be distinguished. Changing the component of a value type requires construction of a new value. The equals and hashCode operations are strictly component-wise. If a value type is represented by a JVM reference, that reference cannot be successfully synchronized on, and cannot be usefully compared for reference equality. A value type can be viewed in terms of what it doesn’t do. We can say that a value type omits all value-unsafe operations, which could violate the constraints on value types.  These operations, which are ordinarily allowed for Java object types, are pointer equality comparison (the acmp instruction), synchronization (the monitor instructions), all the wait and notify methods of class Object, and non-trivial finalize methods. The clone method is also value-unsafe, although for value types it could be treated as the identity function. Finally, and most importantly, any side effect on an object (however visible) also counts as an value-unsafe operation. A value type may have methods, but such methods must not change the components of the value. It is reasonable and useful to define methods like toString, equals, and hashCode on value types, and also methods which are specifically valuable to users of the value type. Representations of Value Value types have two natural representations in the JVM, unboxed and boxed. An unboxed value consists of the components, as simple variables. For example, the complex number x=(1+2i), in rectangular coordinate form, may be represented in unboxed form by the following pair of variables: /*Complex x = Complex.valueOf(1.0, 2.0):*/ double x_re = 1.0, x_im = 2.0; These variables might be locals, parameters, or fields. Their association as components of a single value is not defined to the JVM. Here is a sample computation which computes the norm of the difference between two complex numbers: double distance(/*Complex x:*/ double x_re, double x_im,         /*Complex y:*/ double y_re, double y_im) {     /*Complex z = x.minus(y):*/     double z_re = x_re - y_re, z_im = x_im - y_im;     /*return z.abs():*/     return Math.sqrt(z_re*z_re + z_im*z_im); } A boxed representation groups component values under a single object reference. The reference is to a ‘wrapper class’ that carries the component values in its fields. (A primitive type can naturally be equated with a trivial value type with just one component of that type. In that view, the wrapper class Integer can serve as a boxed representation of value type int.) The unboxed representation of complex numbers is practical for many uses, but it fails to cover several major use cases: return values, array elements, and generic APIs. The two components of a complex number cannot be directly returned from a Java function, since Java does not support multiple return values. The same story applies to array elements: Java has no ’array of structs’ feature. (Double-length arrays are a possible workaround for complex numbers, but not for value types with heterogeneous components.) By generic APIs I mean both those which use generic types, like Arrays.asList and those which have special case support for primitive types, like String.valueOf and PrintStream.println. Those APIs do not support unboxed values, and offer some problems to boxed values. Any ’real’ JVM type should have a story for returns, arrays, and API interoperability. The basic problem here is that value types fall between primitive types and object types. Value types are clearly more complex than primitive types, and object types are slightly too complicated. Objects are a little bit dangerous to use as value carriers, since object references can be compared for pointer equality, and can be synchronized on. Also, as many Java programmers have observed, there is often a performance cost to using wrapper objects, even on modern JVMs. Even so, wrapper classes are a good starting point for talking about value types. If there were a set of structural rules and restrictions which would prevent value-unsafe operations on value types, wrapper classes would provide a good notation for defining value types. This note attempts to define such rules and restrictions. Let’s Start Coding Now it is time to look at some real code. Here is a definition, written in Java, of a complex number value type. @ValueSafe public final class Complex implements java.io.Serializable {     // immutable component structure:     public final double re, im;     private Complex(double re, double im) {         this.re = re; this.im = im;     }     // interoperability methods:     public String toString() { return "Complex("+re+","+im+")"; }     public List<Double> asList() { return Arrays.asList(re, im); }     public boolean equals(Complex c) {         return re == c.re && im == c.im;     }     public boolean equals(@ValueSafe Object x) {         return x instanceof Complex && equals((Complex) x);     }     public int hashCode() {         return 31*Double.valueOf(re).hashCode()                 + Double.valueOf(im).hashCode();     }     // factory methods:     public static Complex valueOf(double re, double im) {         return new Complex(re, im);     }     public Complex changeRe(double re2) { return valueOf(re2, im); }     public Complex changeIm(double im2) { return valueOf(re, im2); }     public static Complex cast(@ValueSafe Object x) {         return x == null ? ZERO : (Complex) x;     }     // utility methods and constants:     public Complex plus(Complex c)  { return new Complex(re+c.re, im+c.im); }     public Complex minus(Complex c) { return new Complex(re-c.re, im-c.im); }     public double abs() { return Math.sqrt(re*re + im*im); }     public static final Complex PI = valueOf(Math.PI, 0.0);     public static final Complex ZERO = valueOf(0.0, 0.0); } This is not a minimal definition, because it includes some utility methods and other optional parts.  The essential elements are as follows: The class is marked as a value type with an annotation. The class is final, because it does not make sense to create subclasses of value types. The fields of the class are all non-private and final.  (I.e., the type is immutable and structurally transparent.) From the supertype Object, all public non-final methods are overridden. The constructor is private. Beyond these bare essentials, we can observe the following features in this example, which are likely to be typical of all value types: One or more factory methods are responsible for value creation, including a component-wise valueOf method. There are utility methods for complex arithmetic and instance creation, such as plus and changeIm. There are static utility constants, such as PI. The type is serializable, using the default mechanisms. There are methods for converting to and from dynamically typed references, such as asList and cast. The Rules In order to use value types properly, the programmer must avoid value-unsafe operations.  A helpful Java compiler should issue errors (or at least warnings) for code which provably applies value-unsafe operations, and should issue warnings for code which might be correct but does not provably avoid value-unsafe operations.  No such compilers exist today, but to simplify our account here, we will pretend that they do exist. A value-safe type is any class, interface, or type parameter marked with the @ValueSafe annotation, or any subtype of a value-safe type.  If a value-safe class is marked final, it is in fact a value type.  All other value-safe classes must be abstract.  The non-static fields of a value class must be non-public and final, and all its constructors must be private. Under the above rules, a standard interface could be helpful to define value types like Complex.  Here is an example: @ValueSafe public interface ValueType extends java.io.Serializable {     // All methods listed here must get redefined.     // Definitions must be value-safe, which means     // they may depend on component values only.     List<? extends Object> asList();     int hashCode();     boolean equals(@ValueSafe Object c);     String toString(); } //@ValueSafe inherited from supertype: public final class Complex implements ValueType { … The main advantage of such a conventional interface is that (unlike an annotation) it is reified in the runtime type system.  It could appear as an element type or parameter bound, for facilities which are designed to work on value types only.  More broadly, it might assist the JVM to perform dynamic enforcement of the rules for value types. Besides types, the annotation @ValueSafe can mark fields, parameters, local variables, and methods.  (This is redundant when the type is also value-safe, but may be useful when the type is Object or another supertype of a value type.)  Working forward from these annotations, an expression E is defined as value-safe if it satisfies one or more of the following: The type of E is a value-safe type. E names a field, parameter, or local variable whose declaration is marked @ValueSafe. E is a call to a method whose declaration is marked @ValueSafe. E is an assignment to a value-safe variable, field reference, or array reference. E is a cast to a value-safe type from a value-safe expression. E is a conditional expression E0 ? E1 : E2, and both E1 and E2 are value-safe. Assignments to value-safe expressions and initializations of value-safe names must take their values from value-safe expressions. A value-safe expression may not be the subject of a value-unsafe operation.  In particular, it cannot be synchronized on, nor can it be compared with the “==” operator, not even with a null or with another value-safe type. In a program where all of these rules are followed, no value-type value will be subject to a value-unsafe operation.  Thus, the prime axiom of value types will be satisfied, that no two value type will be distinguishable as long as their component values are equal. More Code To illustrate these rules, here are some usage examples for Complex: Complex pi = Complex.valueOf(Math.PI, 0); Complex zero = pi.changeRe(0);  //zero = pi; zero.re = 0; ValueType vtype = pi; @SuppressWarnings("value-unsafe")   Object obj = pi; @ValueSafe Object obj2 = pi; obj2 = new Object();  // ok List<Complex> clist = new ArrayList<Complex>(); clist.add(pi);  // (ok assuming List.add param is @ValueSafe) List<ValueType> vlist = new ArrayList<ValueType>(); vlist.add(pi);  // (ok) List<Object> olist = new ArrayList<Object>(); olist.add(pi);  // warning: "value-unsafe" boolean z = pi.equals(zero); boolean z1 = (pi == zero);  // error: reference comparison on value type boolean z2 = (pi == null);  // error: reference comparison on value type boolean z3 = (pi == obj2);  // error: reference comparison on value type synchronized (pi) { }  // error: synch of value, unpredictable result synchronized (obj2) { }  // unpredictable result Complex qq = pi; qq = null;  // possible NPE; warning: “null-unsafe" qq = (Complex) obj;  // warning: “null-unsafe" qq = Complex.cast(obj);  // OK @SuppressWarnings("null-unsafe")   Complex empty = null;  // possible NPE qq = empty;  // possible NPE (null pollution) The Payoffs It follows from this that either the JVM or the java compiler can replace boxed value-type values with unboxed ones, without affecting normal computations.  Fields and variables of value types can be split into their unboxed components.  Non-static methods on value types can be transformed into static methods which take the components as value parameters. Some common questions arise around this point in any discussion of value types. Why burden the programmer with all these extra rules?  Why not detect programs automagically and perform unboxing transparently?  The answer is that it is easy to break the rules accidently unless they are agreed to by the programmer and enforced.  Automatic unboxing optimizations are tantalizing but (so far) unreachable ideal.  In the current state of the art, it is possible exhibit benchmarks in which automatic unboxing provides the desired effects, but it is not possible to provide a JVM with a performance model that assures the programmer when unboxing will occur.  This is why I’m writing this note, to enlist help from, and provide assurances to, the programmer.  Basically, I’m shooting for a good set of user-supplied “pragmas” to frame the desired optimization. Again, the important thing is that the unboxing must be done reliably, or else programmers will have no reason to work with the extra complexity of the value-safety rules.  There must be a reasonably stable performance model, wherein using a value type has approximately the same performance characteristics as writing the unboxed components as separate Java variables. There are some rough corners to the present scheme.  Since Java fields and array elements are initialized to null, value-type computations which incorporate uninitialized variables can produce null pointer exceptions.  One workaround for this is to require such variables to be null-tested, and the result replaced with a suitable all-zero value of the value type.  That is what the “cast” method does above. Generically typed APIs like List<T> will continue to manipulate boxed values always, at least until we figure out how to do reification of generic type instances.  Use of such APIs will elicit warnings until their type parameters (and/or relevant members) are annotated or typed as value-safe.  Retrofitting List<T> is likely to expose flaws in the present scheme, which we will need to engineer around.  Here are a couple of first approaches: public interface java.util.List<@ValueSafe T> extends Collection<T> { … public interface java.util.List<T extends Object|ValueType> extends Collection<T> { … (The second approach would require disjunctive types, in which value-safety is “contagious” from the constituent types.) With more transformations, the return value types of methods can also be unboxed.  This may require significant bytecode-level transformations, and would work best in the presence of a bytecode representation for multiple value groups, which I have proposed elsewhere under the title “Tuples in the VM”. But for starters, the JVM can apply this transformation under the covers, to internally compiled methods.  This would give a way to express multiple return values and structured return values, which is a significant pain-point for Java programmers, especially those who work with low-level structure types favored by modern vector and graphics processors.  The lack of multiple return values has a strong distorting effect on many Java APIs. Even if the JVM fails to unbox a value, there is still potential benefit to the value type.  Clustered computing systems something have copy operations (serialization or something similar) which apply implicitly to command operands.  When copying JVM objects, it is extremely helpful to know when an object’s identity is important or not.  If an object reference is a copied operand, the system may have to create a proxy handle which points back to the original object, so that side effects are visible.  Proxies must be managed carefully, and this can be expensive.  On the other hand, value types are exactly those types which a JVM can “copy and forget” with no downside. Array types are crucial to bulk data interfaces.  (As data sizes and rates increase, bulk data becomes more important than scalar data, so arrays are definitely accompanying us into the future of computing.)  Value types are very helpful for adding structure to bulk data, so a successful value type mechanism will make it easier for us to express richer forms of bulk data. Unboxing arrays (i.e., arrays containing unboxed values) will provide better cache and memory density, and more direct data movement within clustered or heterogeneous computing systems.  They require the deepest transformations, relative to today’s JVM.  There is an impedance mismatch between value-type arrays and Java’s covariant array typing, so compromises will need to be struck with existing Java semantics.  It is probably worth the effort, since arrays of unboxed value types are inherently more memory-efficient than standard Java arrays, which rely on dependent pointer chains. It may be sufficient to extend the “value-safe” concept to array declarations, and allow low-level transformations to change value-safe array declarations from the standard boxed form into an unboxed tuple-based form.  Such value-safe arrays would not be convertible to Object[] arrays.  Certain connection points, such as Arrays.copyOf and System.arraycopy might need additional input/output combinations, to allow smooth conversion between arrays with boxed and unboxed elements. Alternatively, the correct solution may have to wait until we have enough reification of generic types, and enough operator overloading, to enable an overhaul of Java arrays. Implicit Method Definitions The example of class Complex above may be unattractively complex.  I believe most or all of the elements of the example class are required by the logic of value types. If this is true, a programmer who writes a value type will have to write lots of error-prone boilerplate code.  On the other hand, I think nearly all of the code (except for the domain-specific parts like plus and minus) can be implicitly generated. Java has a rule for implicitly defining a class’s constructor, if no it defines no constructors explicitly.  Likewise, there are rules for providing default access modifiers for interface members.  Because of the highly regular structure of value types, it might be reasonable to perform similar implicit transformations on value types.  Here’s an example of a “highly implicit” definition of a complex number type: public class Complex implements ValueType {  // implicitly final     public double re, im;  // implicitly public final     //implicit methods are defined elementwise from te fields:     //  toString, asList, equals(2), hashCode, valueOf, cast     //optionally, explicit methods (plus, abs, etc.) would go here } In other words, with the right defaults, a simple value type definition can be a one-liner.  The observant reader will have noticed the similarities (and suitable differences) between the explicit methods above and the corresponding methods for List<T>. Another way to abbreviate such a class would be to make an annotation the primary trigger of the functionality, and to add the interface(s) implicitly: public @ValueType class Complex { … // implicitly final, implements ValueType (But to me it seems better to communicate the “magic” via an interface, even if it is rooted in an annotation.) Implicitly Defined Value Types So far we have been working with nominal value types, which is to say that the sequence of typed components is associated with a name and additional methods that convey the intention of the programmer.  A simple ordered pair of floating point numbers can be variously interpreted as (to name a few possibilities) a rectangular or polar complex number or Cartesian point.  The name and the methods convey the intended meaning. But what if we need a truly simple ordered pair of floating point numbers, without any further conceptual baggage?  Perhaps we are writing a method (like “divideAndRemainder”) which naturally returns a pair of numbers instead of a single number.  Wrapping the pair of numbers in a nominal type (like “QuotientAndRemainder”) makes as little sense as wrapping a single return value in a nominal type (like “Quotient”).  What we need here are structural value types commonly known as tuples. For the present discussion, let us assign a conventional, JVM-friendly name to tuples, roughly as follows: public class java.lang.tuple.$DD extends java.lang.tuple.Tuple {      double $1, $2; } Here the component names are fixed and all the required methods are defined implicitly.  The supertype is an abstract class which has suitable shared declarations.  The name itself mentions a JVM-style method parameter descriptor, which may be “cracked” to determine the number and types of the component fields. The odd thing about such a tuple type (and structural types in general) is it must be instantiated lazily, in response to linkage requests from one or more classes that need it.  The JVM and/or its class loaders must be prepared to spin a tuple type on demand, given a simple name reference, $xyz, where the xyz is cracked into a series of component types.  (Specifics of naming and name mangling need some tasteful engineering.) Tuples also seem to demand, even more than nominal types, some support from the language.  (This is probably because notations for non-nominal types work best as combinations of punctuation and type names, rather than named constructors like Function3 or Tuple2.)  At a minimum, languages with tuples usually (I think) have some sort of simple bracket notation for creating tuples, and a corresponding pattern-matching syntax (or “destructuring bind”) for taking tuples apart, at least when they are parameter lists.  Designing such a syntax is no simple thing, because it ought to play well with nominal value types, and also with pre-existing Java features, such as method parameter lists, implicit conversions, generic types, and reflection.  That is a task for another day. Other Use Cases Besides complex numbers and simple tuples there are many use cases for value types.  Many tuple-like types have natural value-type representations. These include rational numbers, point locations and pixel colors, and various kinds of dates and addresses. Other types have a variable-length ‘tail’ of internal values. The most common example of this is String, which is (mathematically) a sequence of UTF-16 character values. Similarly, bit vectors, multiple-precision numbers, and polynomials are composed of sequences of values. Such types include, in their representation, a reference to a variable-sized data structure (often an array) which (somehow) represents the sequence of values. The value type may also include ’header’ information. Variable-sized values often have a length distribution which favors short lengths. In that case, the design of the value type can make the first few values in the sequence be direct ’header’ fields of the value type. In the common case where the header is enough to represent the whole value, the tail can be a shared null value, or even just a null reference. Note that the tail need not be an immutable object, as long as the header type encapsulates it well enough. This is the case with String, where the tail is a mutable (but never mutated) character array. Field types and their order must be a globally visible part of the API.  The structure of the value type must be transparent enough to have a globally consistent unboxed representation, so that all callers and callees agree about the type and order of components  that appear as parameters, return types, and array elements.  This is a trade-off between efficiency and encapsulation, which is forced on us when we remove an indirection enjoyed by boxed representations.  A JVM-only transformation would not care about such visibility, but a bytecode transformation would need to take care that (say) the components of complex numbers would not get swapped after a redefinition of Complex and a partial recompile.  Perhaps constant pool references to value types need to declare the field order as assumed by each API user. This brings up the delicate status of private fields in a value type.  It must always be possible to load, store, and copy value types as coordinated groups, and the JVM performs those movements by moving individual scalar values between locals and stack.  If a component field is not public, what is to prevent hostile code from plucking it out of the tuple using a rogue aload or astore instruction?  Nothing but the verifier, so we may need to give it more smarts, so that it treats value types as inseparable groups of stack slots or locals (something like long or double). My initial thought was to make the fields always public, which would make the security problem moot.  But public is not always the right answer; consider the case of String, where the underlying mutable character array must be encapsulated to prevent security holes.  I believe we can win back both sides of the tradeoff, by training the verifier never to split up the components in an unboxed value.  Just as the verifier encapsulates the two halves of a 64-bit primitive, it can encapsulate the the header and body of an unboxed String, so that no code other than that of class String itself can take apart the values. Similar to String, we could build an efficient multi-precision decimal type along these lines: public final class DecimalValue extends ValueType {     protected final long header;     protected private final BigInteger digits;     public DecimalValue valueOf(int value, int scale) {         assert(scale >= 0);         return new DecimalValue(((long)value << 32) + scale, null);     }     public DecimalValue valueOf(long value, int scale) {         if (value == (int) value)             return valueOf((int)value, scale);         return new DecimalValue(-scale, new BigInteger(value));     } } Values of this type would be passed between methods as two machine words. Small values (those with a significand which fits into 32 bits) would be represented without any heap data at all, unless the DecimalValue itself were boxed. (Note the tension between encapsulation and unboxing in this case.  It would be better if the header and digits fields were private, but depending on where the unboxing information must “leak”, it is probably safer to make a public revelation of the internal structure.) Note that, although an array of Complex can be faked with a double-length array of double, there is no easy way to fake an array of unboxed DecimalValues.  (Either an array of boxed values or a transposed pair of homogeneous arrays would be reasonable fallbacks, in a current JVM.)  Getting the full benefit of unboxing and arrays will require some new JVM magic. Although the JVM emphasizes portability, system dependent code will benefit from using machine-level types larger than 64 bits.  For example, the back end of a linear algebra package might benefit from value types like Float4 which map to stock vector types.  This is probably only worthwhile if the unboxing arrays can be packed with such values. More Daydreams A more finely-divided design for dynamic enforcement of value safety could feature separate marker interfaces for each invariant.  An empty marker interface Unsynchronizable could cause suitable exceptions for monitor instructions on objects in marked classes.  More radically, a Interchangeable marker interface could cause JVM primitives that are sensitive to object identity to raise exceptions; the strangest result would be that the acmp instruction would have to be specified as raising an exception. @ValueSafe public interface ValueType extends java.io.Serializable,         Unsynchronizable, Interchangeable { … public class Complex implements ValueType {     // inherits Serializable, Unsynchronizable, Interchangeable, @ValueSafe     … It seems possible that Integer and the other wrapper types could be retro-fitted as value-safe types.  This is a major change, since wrapper objects would be unsynchronizable and their references interchangeable.  It is likely that code which violates value-safety for wrapper types exists but is uncommon.  It is less plausible to retro-fit String, since the prominent operation String.intern is often used with value-unsafe code. We should also reconsider the distinction between boxed and unboxed values in code.  The design presented above obscures that distinction.  As another thought experiment, we could imagine making a first class distinction in the type system between boxed and unboxed representations.  Since only primitive types are named with a lower-case initial letter, we could define that the capitalized version of a value type name always refers to the boxed representation, while the initial lower-case variant always refers to boxed.  For example: complex pi = complex.valueOf(Math.PI, 0); Complex boxPi = pi;  // convert to boxed myList.add(boxPi); complex z = myList.get(0);  // unbox Such a convention could perhaps absorb the current difference between int and Integer, double and Double. It might also allow the programmer to express a helpful distinction among array types. As said above, array types are crucial to bulk data interfaces, but are limited in the JVM.  Extending arrays beyond the present limitations is worth thinking about; for example, the Maxine JVM implementation has a hybrid object/array type.  Something like this which can also accommodate value type components seems worthwhile.  On the other hand, does it make sense for value types to contain short arrays?  And why should random-access arrays be the end of our design process, when bulk data is often sequentially accessed, and it might make sense to have heterogeneous streams of data as the natural “jumbo” data structure.  These considerations must wait for another day and another note. More Work It seems to me that a good sequence for introducing such value types would be as follows: Add the value-safety restrictions to an experimental version of javac. Code some sample applications with value types, including Complex and DecimalValue. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so.  Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. A staggered roll-out like this would decouple language changes from bytecode changes, which is always a convenient thing. A similar investigation should be applied (concurrently) to array types.  In this case, it seems to me that the starting point is in the JVM: Add an experimental unboxing array data structure to a production JVM, perhaps along the lines of Maxine hybrids.  No bytecode or language support is required at first; everything can be done with encapsulated unsafe operations and/or method handles. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so.  Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. That’s enough musing me for now.  Back to work!

    Read the article

  • Hello Operator, My Switch Is Bored

    - by Paul White
    This is a post for T-SQL Tuesday #43 hosted by my good friend Rob Farley. The topic this month is Plan Operators. I haven’t taken part in T-SQL Tuesday before, but I do like to write about execution plans, so this seemed like a good time to start. This post is in two parts. The first part is primarily an excuse to use a pretty bad play on words in the title of this blog post (if you’re too young to know what a telephone operator or a switchboard is, I hate you). The second part of the post looks at an invisible query plan operator (so to speak). 1. My Switch Is Bored Allow me to present the rare and interesting execution plan operator, Switch: Books Online has this to say about Switch: Following that description, I had a go at producing a Fast Forward Cursor plan that used the TOP operator, but had no luck. That may be due to my lack of skill with cursors, I’m not too sure. The only application of Switch in SQL Server 2012 that I am familiar with requires a local partitioned view: CREATE TABLE dbo.T1 (c1 int NOT NULL CHECK (c1 BETWEEN 00 AND 24)); CREATE TABLE dbo.T2 (c1 int NOT NULL CHECK (c1 BETWEEN 25 AND 49)); CREATE TABLE dbo.T3 (c1 int NOT NULL CHECK (c1 BETWEEN 50 AND 74)); CREATE TABLE dbo.T4 (c1 int NOT NULL CHECK (c1 BETWEEN 75 AND 99)); GO CREATE VIEW V1 AS SELECT c1 FROM dbo.T1 UNION ALL SELECT c1 FROM dbo.T2 UNION ALL SELECT c1 FROM dbo.T3 UNION ALL SELECT c1 FROM dbo.T4; Not only that, but it needs an updatable local partitioned view. We’ll need some primary keys to meet that requirement: ALTER TABLE dbo.T1 ADD CONSTRAINT PK_T1 PRIMARY KEY (c1);   ALTER TABLE dbo.T2 ADD CONSTRAINT PK_T2 PRIMARY KEY (c1);   ALTER TABLE dbo.T3 ADD CONSTRAINT PK_T3 PRIMARY KEY (c1);   ALTER TABLE dbo.T4 ADD CONSTRAINT PK_T4 PRIMARY KEY (c1); We also need an INSERT statement that references the view. Even more specifically, to see a Switch operator, we need to perform a single-row insert (multi-row inserts use a different plan shape): INSERT dbo.V1 (c1) VALUES (1); And now…the execution plan: The Constant Scan manufactures a single row with no columns. The Compute Scalar works out which partition of the view the new value should go in. The Assert checks that the computed partition number is not null (if it is, an error is returned). The Nested Loops Join executes exactly once, with the partition id as an outer reference (correlated parameter). The Switch operator checks the value of the parameter and executes the corresponding input only. If the partition id is 0, the uppermost Clustered Index Insert is executed, adding a row to table T1. If the partition id is 1, the next lower Clustered Index Insert is executed, adding a row to table T2…and so on. In case you were wondering, here’s a query and execution plan for a multi-row insert to the view: INSERT dbo.V1 (c1) VALUES (1), (2); Yuck! An Eager Table Spool and four Filters! I prefer the Switch plan. My guess is that almost all the old strategies that used a Switch operator have been replaced over time, using things like a regular Concatenation Union All combined with Start-Up Filters on its inputs. Other new (relative to the Switch operator) features like table partitioning have specific execution plan support that doesn’t need the Switch operator either. This feels like a bit of a shame, but perhaps it is just nostalgia on my part, it’s hard to know. Please do let me know if you encounter a query that can still use the Switch operator in 2012 – it must be very bored if this is the only possible modern usage! 2. Invisible Plan Operators The second part of this post uses an example based on a question Dave Ballantyne asked using the SQL Sentry Plan Explorer plan upload facility. If you haven’t tried that yet, make sure you’re on the latest version of the (free) Plan Explorer software, and then click the Post to SQLPerformance.com button. That will create a site question with the query plan attached (which can be anonymized if the plan contains sensitive information). Aaron Bertrand and I keep a close eye on questions there, so if you have ever wanted to ask a query plan question of either of us, that’s a good way to do it. The problem The issue I want to talk about revolves around a query issued against a calendar table. The script below creates a simplified version and adds 100 years of per-day information to it: USE tempdb; GO CREATE TABLE dbo.Calendar ( dt date NOT NULL, isWeekday bit NOT NULL, theYear smallint NOT NULL,   CONSTRAINT PK__dbo_Calendar_dt PRIMARY KEY CLUSTERED (dt) ); GO -- Monday is the first day of the week for me SET DATEFIRST 1;   -- Add 100 years of data INSERT dbo.Calendar WITH (TABLOCKX) (dt, isWeekday, theYear) SELECT CA.dt, isWeekday = CASE WHEN DATEPART(WEEKDAY, CA.dt) IN (6, 7) THEN 0 ELSE 1 END, theYear = YEAR(CA.dt) FROM Sandpit.dbo.Numbers AS N CROSS APPLY ( VALUES (DATEADD(DAY, N.n - 1, CONVERT(date, '01 Jan 2000', 113))) ) AS CA (dt) WHERE N.n BETWEEN 1 AND 36525; The following query counts the number of weekend days in 2013: SELECT Days = COUNT_BIG(*) FROM dbo.Calendar AS C WHERE theYear = 2013 AND isWeekday = 0; It returns the correct result (104) using the following execution plan: The query optimizer has managed to estimate the number of rows returned from the table exactly, based purely on the default statistics created separately on the two columns referenced in the query’s WHERE clause. (Well, almost exactly, the unrounded estimate is 104.289 rows.) There is already an invisible operator in this query plan – a Filter operator used to apply the WHERE clause predicates. We can see it by re-running the query with the enormously useful (but undocumented) trace flag 9130 enabled: Now we can see the full picture. The whole table is scanned, returning all 36,525 rows, before the Filter narrows that down to just the 104 we want. Without the trace flag, the Filter is incorporated in the Clustered Index Scan as a residual predicate. It is a little bit more efficient than using a separate operator, but residual predicates are still something you will want to avoid where possible. The estimates are still spot on though: Anyway, looking to improve the performance of this query, Dave added the following filtered index to the Calendar table: CREATE NONCLUSTERED INDEX Weekends ON dbo.Calendar(theYear) WHERE isWeekday = 0; The original query now produces a much more efficient plan: Unfortunately, the estimated number of rows produced by the seek is now wrong (365 instead of 104): What’s going on? The estimate was spot on before we added the index! Explanation You might want to grab a coffee for this bit. Using another trace flag or two (8606 and 8612) we can see that the cardinality estimates were exactly right initially: The highlighted information shows the initial cardinality estimates for the base table (36,525 rows), the result of applying the two relational selects in our WHERE clause (104 rows), and after performing the COUNT_BIG(*) group by aggregate (1 row). All of these are correct, but that was before cost-based optimization got involved :) Cost-based optimization When cost-based optimization starts up, the logical tree above is copied into a structure (the ‘memo’) that has one group per logical operation (roughly speaking). The logical read of the base table (LogOp_Get) ends up in group 7; the two predicates (LogOp_Select) end up in group 8 (with the details of the selections in subgroups 0-6). These two groups still have the correct cardinalities as trace flag 8608 output (initial memo contents) shows: During cost-based optimization, a rule called SelToIdxStrategy runs on group 8. It’s job is to match logical selections to indexable expressions (SARGs). It successfully matches the selections (theYear = 2013, is Weekday = 0) to the filtered index, and writes a new alternative into the memo structure. The new alternative is entered into group 8 as option 1 (option 0 was the original LogOp_Select): The new alternative is to do nothing (PhyOp_NOP = no operation), but to instead follow the new logical instructions listed below the NOP. The LogOp_GetIdx (full read of an index) goes into group 21, and the LogOp_SelectIdx (selection on an index) is placed in group 22, operating on the result of group 21. The definition of the comparison ‘the Year = 2013’ (ScaOp_Comp downwards) was already present in the memo starting at group 2, so no new memo groups are created for that. New Cardinality Estimates The new memo groups require two new cardinality estimates to be derived. First, LogOp_Idx (full read of the index) gets a predicted cardinality of 10,436. This number comes from the filtered index statistics: DBCC SHOW_STATISTICS (Calendar, Weekends) WITH STAT_HEADER; The second new cardinality derivation is for the LogOp_SelectIdx applying the predicate (theYear = 2013). To get a number for this, the cardinality estimator uses statistics for the column ‘theYear’, producing an estimate of 365 rows (there are 365 days in 2013!): DBCC SHOW_STATISTICS (Calendar, theYear) WITH HISTOGRAM; This is where the mistake happens. Cardinality estimation should have used the filtered index statistics here, to get an estimate of 104 rows: DBCC SHOW_STATISTICS (Calendar, Weekends) WITH HISTOGRAM; Unfortunately, the logic has lost sight of the link between the read of the filtered index (LogOp_GetIdx) in group 22, and the selection on that index (LogOp_SelectIdx) that it is deriving a cardinality estimate for, in group 21. The correct cardinality estimate (104 rows) is still present in the memo, attached to group 8, but that group now has a PhyOp_NOP implementation. Skipping over the rest of cost-based optimization (in a belated attempt at brevity) we can see the optimizer’s final output using trace flag 8607: This output shows the (incorrect, but understandable) 365 row estimate for the index range operation, and the correct 104 estimate still attached to its PhyOp_NOP. This tree still has to go through a few post-optimizer rewrites and ‘copy out’ from the memo structure into a tree suitable for the execution engine. One step in this process removes PhyOp_NOP, discarding its 104-row cardinality estimate as it does so. To finish this section on a more positive note, consider what happens if we add an OVER clause to the query aggregate. This isn’t intended to be a ‘fix’ of any sort, I just want to show you that the 104 estimate can survive and be used if later cardinality estimation needs it: SELECT Days = COUNT_BIG(*) OVER () FROM dbo.Calendar AS C WHERE theYear = 2013 AND isWeekday = 0; The estimated execution plan is: Note the 365 estimate at the Index Seek, but the 104 lives again at the Segment! We can imagine the lost predicate ‘isWeekday = 0’ as sitting between the seek and the segment in an invisible Filter operator that drops the estimate from 365 to 104. Even though the NOP group is removed after optimization (so we don’t see it in the execution plan) bear in mind that all cost-based choices were made with the 104-row memo group present, so although things look a bit odd, it shouldn’t affect the optimizer’s plan selection. I should also mention that we can work around the estimation issue by including the index’s filtering columns in the index key: CREATE NONCLUSTERED INDEX Weekends ON dbo.Calendar(theYear, isWeekday) WHERE isWeekday = 0 WITH (DROP_EXISTING = ON); There are some downsides to doing this, including that changes to the isWeekday column may now require Halloween Protection, but that is unlikely to be a big problem for a static calendar table ;)  With the updated index in place, the original query produces an execution plan with the correct cardinality estimation showing at the Index Seek: That’s all for today, remember to let me know about any Switch plans you come across on a modern instance of SQL Server! Finally, here are some other posts of mine that cover other plan operators: Segment and Sequence Project Common Subexpression Spools Why Plan Operators Run Backwards Row Goals and the Top Operator Hash Match Flow Distinct Top N Sort Index Spools and Page Splits Singleton and Range Seeks Bitmaps Hash Join Performance Compute Scalar © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

    Read the article

  • Making a Statement: How to retrieve the T-SQL statement that caused an event

    - by extended_events
    If you’ve done any troubleshooting of T-SQL, you know that sooner or later, probably sooner, you’re going to want to take a look at the actual statements you’re dealing with. In extended events we offer an action (See the BOL topic that covers Extended Events Objects for a description of actions) named sql_text that seems like it is just the ticket. Well…not always – sounds like a good reason for a blog post. When is a statement not THE statement? The sql_text action returns the same information that is returned from DBCC INPUTBUFFER, which may or may not be what you want. For example, if you execute a stored procedure, the sql_text action will return something along the lines of “EXEC sp_notwhatiwanted” assuming that is the statement you sent from the client. Often times folks would like something more specific, like the actual statements that are being run from within the stored procedure or batch. Enter the stack Extended events offers another action, this one with the descriptive name of tsql_stack, that includes the sql_handle and offset information about the statements being run when an event occurs. With the sql_handle and offset values you can retrieve the specific statement you seek using the DMV dm_exec_sql_statement. The BOL topic for dm_exec_sql_statement provides an example for how to extract this information, so I’ll cover the gymnastics required to get the sql_handle and offset values out of the tsql_stack data collected by the action. I’m the first to admit that this isn’t pretty, but this is what we have in SQL Server 2008 and 2008 R2. We will be making it easier to get statement level information in the next major release of SQL Server. The sample code For this example I have a stored procedure that includes multiple statements and I have a need to differentiate between those two statements in my tracing. I’m going to track two events: module_end tracks the completion of the stored procedure execution and sp_statement_completed tracks the execution of each statement within a stored procedure. I’m adding the tsql_stack events (since that’s the topic of this post) and the sql_text action for comparison sake. (If you have questions about creating event sessions, check out Pedro’s post Introduction to Extended Events.) USE AdventureWorks2008GO -- Test SPCREATE PROCEDURE sp_multiple_statementsASSELECT 'This is the first statement'SELECT 'this is the second statement'GO -- Create a session to look at the spCREATE EVENT SESSION track_sprocs ON SERVERADD EVENT sqlserver.module_end (ACTION (sqlserver.tsql_stack, sqlserver.sql_text)),ADD EVENT sqlserver.sp_statement_completed (ACTION (sqlserver.tsql_stack, sqlserver.sql_text))ADD TARGET package0.ring_bufferWITH (MAX_DISPATCH_LATENCY = 1 SECONDS)GO -- Start the sessionALTER EVENT SESSION track_sprocs ON SERVERSTATE = STARTGO -- Run the test procedureEXEC sp_multiple_statementsGO -- Stop collection of events but maintain ring bufferALTER EVENT SESSION track_sprocs ON SERVERDROP EVENT sqlserver.module_end,DROP EVENT sqlserver.sp_statement_completedGO Aside: Altering the session to drop the events is a neat little trick that allows me to stop collection of events while keeping in-memory targets such as the ring buffer available for use. If you stop the session the in-memory target data is lost. Now that we’ve collected some events related to running the stored procedure, we need to do some processing of the data. I’m going to do this in multiple steps using temporary tables so you can see what’s going on; kind of like having to “show your work” on a math test. The first step is to just cast the target data into XML so I can work with it. After that you can pull out the interesting columns, for our purposes I’m going to limit the output to just the event name, object name, stack and sql text. You can see that I’ve don a second CAST, this time of the tsql_stack column, so that I can further process this data. -- Store the XML data to a temp tableSELECT CAST( t.target_data AS XML) xml_dataINTO #xml_event_dataFROM sys.dm_xe_sessions s INNER JOIN sys.dm_xe_session_targets t    ON s.address = t.event_session_addressWHERE s.name = 'track_sprocs' SELECT * FROM #xml_event_data -- Parse the column data out of the XML blockSELECT    event_xml.value('(./@name)', 'varchar(100)') as [event_name],    event_xml.value('(./data[@name="object_name"]/value)[1]', 'varchar(255)') as [object_name],    CAST(event_xml.value('(./action[@name="tsql_stack"]/value)[1]','varchar(MAX)') as XML) as [stack_xml],    event_xml.value('(./action[@name="sql_text"]/value)[1]', 'varchar(max)') as [sql_text]INTO #event_dataFROM #xml_event_data    CROSS APPLY xml_data.nodes('//event') n (event_xml) SELECT * FROM #event_data event_name object_name stack_xml sql_text sp_statement_completed NULL <frame level="1" handle="0x03000500D0057C1403B79600669D00000100000000000000" line="4" offsetStart="94" offsetEnd="172" /><frame level="2" handle="0x01000500CF3F0331B05EC084000000000000000000000000" line="1" offsetStart="0" offsetEnd="-1" /> EXEC sp_multiple_statements sp_statement_completed NULL <frame level="1" handle="0x03000500D0057C1403B79600669D00000100000000000000" line="6" offsetStart="174" offsetEnd="-1" /><frame level="2" handle="0x01000500CF3F0331B05EC084000000000000000000000000" line="1" offsetStart="0" offsetEnd="-1" /> EXEC sp_multiple_statements module_end sp_multiple_statements <frame level="1" handle="0x03000500D0057C1403B79600669D00000100000000000000" line="0" offsetStart="0" offsetEnd="0" /><frame level="2" handle="0x01000500CF3F0331B05EC084000000000000000000000000" line="1" offsetStart="0" offsetEnd="-1" /> EXEC sp_multiple_statements After parsing the columns it’s easier to see what is recorded. You can see that I got back two sp_statement_completed events, which makes sense given the test procedure I’m running, and I got back a single module_end for the entire statement. As described, the sql_text isn’t telling me what I really want to know for the first two events so a little extra effort is required. -- Parse the tsql stack information into columnsSELECT    event_name,    object_name,    frame_xml.value('(./@level)', 'int') as [frame_level],    frame_xml.value('(./@handle)', 'varchar(MAX)') as [sql_handle],    frame_xml.value('(./@offsetStart)', 'int') as [offset_start],    frame_xml.value('(./@offsetEnd)', 'int') as [offset_end]INTO #stack_data    FROM #event_data        CROSS APPLY    stack_xml.nodes('//frame') n (frame_xml)    SELECT * from #stack_data event_name object_name frame_level sql_handle offset_start offset_end sp_statement_completed NULL 1 0x03000500D0057C1403B79600669D00000100000000000000 94 172 sp_statement_completed NULL 2 0x01000500CF3F0331B05EC084000000000000000000000000 0 -1 sp_statement_completed NULL 1 0x03000500D0057C1403B79600669D00000100000000000000 174 -1 sp_statement_completed NULL 2 0x01000500CF3F0331B05EC084000000000000000000000000 0 -1 module_end sp_multiple_statements 1 0x03000500D0057C1403B79600669D00000100000000000000 0 0 module_end sp_multiple_statements 2 0x01000500CF3F0331B05EC084000000000000000000000000 0 -1 Parsing out the stack information doubles the fun and I get two rows for each event. If you examine the stack from the previous table, you can see that each stack has two frames and my query is parsing each event into frames, so this is expected. There is nothing magic about the two frames, that’s just how many I get for this example, it could be fewer or more depending on your statements. The key point here is that I now have a sql_handle and the offset values for those handles, so I can use dm_exec_sql_statement to get the actual statement. Just a reminder, this DMV can only return what is in the cache – if you have old data it’s possible your statements have been ejected from the cache. “Old” is a relative term when talking about caches and can be impacted by server load and how often your statement is actually used. As with most things in life, your mileage may vary. SELECT    qs.*,     SUBSTRING(st.text, (qs.offset_start/2)+1,         ((CASE qs.offset_end          WHEN -1 THEN DATALENGTH(st.text)         ELSE qs.offset_end         END - qs.offset_start)/2) + 1) AS statement_textFROM #stack_data AS qsCROSS APPLY sys.dm_exec_sql_text(CONVERT(varbinary(max),sql_handle,1)) AS st event_name object_name frame_level sql_handle offset_start offset_end statement_text sp_statement_completed NULL 1 0x03000500D0057C1403B79600669D00000100000000000000 94 172 SELECT 'This is the first statement' sp_statement_completed NULL 1 0x03000500D0057C1403B79600669D00000100000000000000 174 -1 SELECT 'this is the second statement' module_end sp_multiple_statements 1 0x03000500D0057C1403B79600669D00000100000000000000 0 0 C Now that looks more like what we were after, the statement_text field is showing the actual statement being run when the sp_statement_completed event occurs. You’ll notice that it’s back down to one row per event, what happened to frame 2? The short answer is, “I don’t know.” In SQL Server 2008 nothing is returned from dm_exec_sql_statement for the second frame and I believe this to be a bug; this behavior has changed in the next major release and I see the actual statement run from the client in frame 2. (In other words I see the same statement that is returned by the sql_text action  or DBCC INPUTBUFFER) There is also something odd going on with frame 1 returned from the module_end event; you can see that the offset values are both 0 and only the first letter of the statement is returned. It seems like the offset_end should actually be –1 in this case and I’m not sure why it’s not returning this correctly. This behavior is being investigated and will hopefully be corrected in the next major version. You can workaround this final oddity by ignoring the offsets and just returning the entire cached statement. SELECT    event_name,    sql_handle,    ts.textFROM #stack_data    CROSS APPLY sys.dm_exec_sql_text(CONVERT(varbinary(max),sql_handle,1)) as ts event_name sql_handle text sp_statement_completed 0x0300070025999F11776BAF006F9D00000100000000000000 CREATE PROCEDURE sp_multiple_statements AS SELECT 'This is the first statement' SELECT 'this is the second statement' sp_statement_completed 0x0300070025999F11776BAF006F9D00000100000000000000 CREATE PROCEDURE sp_multiple_statements AS SELECT 'This is the first statement' SELECT 'this is the second statement' module_end 0x0300070025999F11776BAF006F9D00000100000000000000 CREATE PROCEDURE sp_multiple_statements AS SELECT 'This is the first statement' SELECT 'this is the second statement' Obviously this gives more than you want for the sp_statement_completed events, but it’s the right information for module_end. I leave it to you to determine when this information is needed and use the workaround when appropriate. Aside: You might think it’s odd that I’m showing apparent bugs with my samples, but you’re going to see this behavior if you use this method, so you need to know about it.I’m all about transparency. Happy Eventing- Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Matrix Multiplication with C++ AMP

    - by Daniel Moth
    As part of our API tour of C++ AMP, we looked recently at parallel_for_each. I ended that post by saying we would revisit parallel_for_each after introducing array and array_view. Now is the time, so this is part 2 of parallel_for_each, and also a post that brings together everything we've seen until now. The code for serial and accelerated Consider a naïve (or brute force) serial implementation of matrix multiplication  0: void MatrixMultiplySerial(std::vector<float>& vC, const std::vector<float>& vA, const std::vector<float>& vB, int M, int N, int W) 1: { 2: for (int row = 0; row < M; row++) 3: { 4: for (int col = 0; col < N; col++) 5: { 6: float sum = 0.0f; 7: for(int i = 0; i < W; i++) 8: sum += vA[row * W + i] * vB[i * N + col]; 9: vC[row * N + col] = sum; 10: } 11: } 12: } We notice that each loop iteration is independent from each other and so can be parallelized. If in addition we have really large amounts of data, then this is a good candidate to offload to an accelerator. First, I'll just show you an example of what that code may look like with C++ AMP, and then we'll analyze it. It is assumed that you included at the top of your file #include <amp.h> 13: void MatrixMultiplySimple(std::vector<float>& vC, const std::vector<float>& vA, const std::vector<float>& vB, int M, int N, int W) 14: { 15: concurrency::array_view<const float,2> a(M, W, vA); 16: concurrency::array_view<const float,2> b(W, N, vB); 17: concurrency::array_view<concurrency::writeonly<float>,2> c(M, N, vC); 18: concurrency::parallel_for_each(c.grid, 19: [=](concurrency::index<2> idx) restrict(direct3d) { 20: int row = idx[0]; int col = idx[1]; 21: float sum = 0.0f; 22: for(int i = 0; i < W; i++) 23: sum += a(row, i) * b(i, col); 24: c[idx] = sum; 25: }); 26: } First a visual comparison, just for fun: The beginning and end is the same, i.e. lines 0,1,12 are identical to lines 13,14,26. The double nested loop (lines 2,3,4,5 and 10,11) has been transformed into a parallel_for_each call (18,19,20 and 25). The core algorithm (lines 6,7,8,9) is essentially the same (lines 21,22,23,24). We have extra lines in the C++ AMP version (15,16,17). Now let's dig in deeper. Using array_view and extent When we decided to convert this function to run on an accelerator, we knew we couldn't use the std::vector objects in the restrict(direct3d) function. So we had a choice of copying the data to the the concurrency::array<T,N> object, or wrapping the vector container (and hence its data) with a concurrency::array_view<T,N> object from amp.h – here we used the latter (lines 15,16,17). Now we can access the same data through the array_view objects (a and b) instead of the vector objects (vA and vB), and the added benefit is that we can capture the array_view objects in the lambda (lines 19-25) that we pass to the parallel_for_each call (line 18) and the data will get copied on demand for us to the accelerator. Note that line 15 (and ditto for 16 and 17) could have been written as two lines instead of one: extent<2> e(M, W); array_view<const float, 2> a(e, vA); In other words, we could have explicitly created the extent object instead of letting the array_view create it for us under the covers through the constructor overload we chose. The benefit of the extent object in this instance is that we can express that the data is indeed two dimensional, i.e a matrix. When we were using a vector object we could not do that, and instead we had to track via additional unrelated variables the dimensions of the matrix (i.e. with the integers M and W) – aren't you loving C++ AMP already? Note that the const before the float when creating a and b, will result in the underling data only being copied to the accelerator and not be copied back – a nice optimization. A similar thing is happening on line 17 when creating array_view c, where we have indicated that we do not need to copy the data to the accelerator, only copy it back. The kernel dispatch On line 18 we make the call to the C++ AMP entry point (parallel_for_each) to invoke our parallel loop or, as some may say, dispatch our kernel. The first argument we need to pass describes how many threads we want for this computation. For this algorithm we decided that we want exactly the same number of threads as the number of elements in the output matrix, i.e. in array_view c which will eventually update the vector vC. So each thread will compute exactly one result. Since the elements in c are organized in a 2-dimensional manner we can organize our threads in a two-dimensional manner too. We don't have to think too much about how to create the first argument (a grid) since the array_view object helpfully exposes that as a property. Note that instead of c.grid we could have written grid<2>(c.extent) or grid<2>(extent<2>(M, N)) – the result is the same in that we have specified M*N threads to execute our lambda. The second argument is a restrict(direct3d) lambda that accepts an index object. Since we elected to use a two-dimensional extent as the first argument of parallel_for_each, the index will also be two-dimensional and as covered in the previous posts it represents the thread ID, which in our case maps perfectly to the index of each element in the resulting array_view. The kernel itself The lambda body (lines 20-24), or as some may say, the kernel, is the code that will actually execute on the accelerator. It will be called by M*N threads and we can use those threads to index into the two input array_views (a,b) and write results into the output array_view ( c ). The four lines (21-24) are essentially identical to the four lines of the serial algorithm (6-9). The only difference is how we index into a,b,c versus how we index into vA,vB,vC. The code we wrote with C++ AMP is much nicer in its indexing, because the dimensionality is a first class concept, so you don't have to do funny arithmetic calculating the index of where the next row starts, which you have to do when working with vectors directly (since they store all the data in a flat manner). I skipped over describing line 20. Note that we didn't really need to read the two components of the index into temporary local variables. This mostly reflects my personal choice, in some algorithms to break down the index into local variables with names that make sense for the algorithm, i.e. in this case row and col. In other cases it may i,j,k or x,y,z, or M,N or whatever. Also note that we could have written line 24 as: c(idx[0], idx[1])=sum  or  c(row, col)=sum instead of the simpler c[idx]=sum Targeting a specific accelerator Imagine that we had more than one hardware accelerator on a system and we wanted to pick a specific one to execute this parallel loop on. So there would be some code like this anywhere before line 18: vector<accelerator> accs = MyFunctionThatChoosesSuitableAccelerators(); accelerator acc = accs[0]; …and then we would modify line 18 so we would be calling another overload of parallel_for_each that accepts an accelerator_view as the first argument, so it would become: concurrency::parallel_for_each(acc.default_view, c.grid, ...and the rest of your code remains the same… how simple is that? Comments about this post by Daniel Moth welcome at the original blog.

    Read the article

  • Project Jigsaw: Late for the train: The Q&A

    - by Mark Reinhold
    I recently proposed, to the Java community in general and to the SE 8 (JSR 337) Expert Group in particular, to defer Project Jigsaw from Java 8 to Java 9. I also proposed to aim explicitly for a regular two-year release cycle going forward. Herewith a summary of the key questions I’ve seen in reaction to these proposals, along with answers. Making the decision Q Has the Java SE 8 Expert Group decided whether to defer the addition of a module system and the modularization of the Platform to Java SE 9? A No, it has not yet decided. Q By when do you expect the EG to make this decision? A In the next month or so. Q How can I make sure my voice is heard? A The EG will consider all relevant input from the wider community. If you have a prominent blog, column, or other communication channel then there’s a good chance that we’ve already seen your opinion. If not, you’re welcome to send it to the Java SE 8 Comments List, which is the EG’s official feedback channel. Q What’s the overall tone of the feedback you’ve received? A The feedback has been about evenly divided as to whether Java 8 should be delayed for Jigsaw, Jigsaw should be deferred to Java 9, or some other, usually less-realistic, option should be taken. Project Jigsaw Q Why is Project Jigsaw taking so long? A Project Jigsaw started at Sun, way back in August 2008. Like many efforts during the final years of Sun, it was not well staffed. Jigsaw initially ran on a shoestring, with just a handful of mostly part-time engineers, so progress was slow. During the integration of Sun into Oracle all work on Jigsaw was halted for a time, but it was eventually resumed after a thorough consideration of the alternatives. Project Jigsaw was really only fully staffed about a year ago, around the time that Java 7 shipped. We’ve added a few more engineers to the team since then, but that can’t make up for the inadequate initial staffing and the time lost during the transition. Q So it’s really just a matter of staffing limitations and corporate-integration distractions? A Aside from these difficulties, the other main factor in the duration of the project is the sheer technical difficulty of modularizing the JDK. Q Why is modularizing the JDK so hard? A There are two main reasons. The first is that the JDK code base is deeply interconnected at both the API and the implementation levels, having been built over many years primarily in the style of a monolithic software system. We’ve spent considerable effort eliminating or at least simplifying as many API and implementation dependences as possible, so that both the Platform and its implementations can be presented as a coherent set of interdependent modules, but some particularly thorny cases remain. Q What’s the second reason? A We want to maintain as much compatibility with prior releases as possible, most especially for existing classpath-based applications but also, to the extent feasible, for applications composed of modules. Q Is modularizing the JDK even necessary? Can’t you just put it in one big module? A Modularizing the JDK, and more specifically modularizing the Java SE Platform, will enable standard yet flexible Java runtime configurations scaling from large servers down to small embedded devices. In the long term it will enable the convergence of Java SE with the higher-end Java ME Platforms. Q Is Project Jigsaw just about modularizing the JDK? A As originally conceived, Project Jigsaw was indeed focused primarily upon modularizing the JDK. The growing demand for a truly standard module system for the Java Platform, which could be used not just for the Platform itself but also for libraries and applications built on top of it, later motivated expanding the scope of the effort. Q As a developer, why should I care about Project Jigsaw? A The introduction of a modular Java Platform will, in the long term, fundamentally change the way that Java implementations, libraries, frameworks, tools, and applications are designed, built, and deployed. Q How much progress has Project Jigsaw made? A We’ve actually made a lot of progress. Much of the core functionality of the module system has been prototyped and works at both compile time and run time. We’ve extended the Java programming language with module declarations, worked out a structure for modular source trees and corresponding compiled-class trees, and implemented these features in javac. We’ve defined an efficient module-file format, extended the JVM to bootstrap a modular JRE, and designed and implemented a preliminary API. We’ve used the module system to make a good first cut at dividing the JDK and the Java SE API into a coherent set of modules. Among other things, we’re currently working to retrofit the java.util.ServiceLoader API to support modular services. Q I want to help! How can I get involved? A Check out the project page, read the draft requirements and design overview documents, download the latest prototype build, and play with it. You can tell us what you think, and follow the rest of our work in real time, on the jigsaw-dev list. The Java Platform Module System JSR Q What’s the relationship between Project Jigsaw and the eventual Java Platform Module System JSR? A At a high level, Project Jigsaw has two phases. In the first phase we’re exploring an approach to modularity that’s markedly different from that of existing Java modularity solutions. We’ve assumed that we can change the Java programming language, the virtual machine, and the APIs. Doing so enables a design which can strongly enforce module boundaries in all program phases, from compilation to deployment to execution. That, in turn, leads to better usability, diagnosability, security, and performance. The ultimate goal of the first phase is produce a working prototype which can inform the work of the Module-System JSR EG. Q What will happen in the second phase of Project Jigsaw? A The second phase will produce the reference implementation of the specification created by the Module-System JSR EG. The EG might ultimately choose an entirely different approach than the one we’re exploring now. If and when that happens then Project Jigsaw will change course as necessary, but either way I think that the end result will be better for having been informed by our current work. Maven & OSGi Q Why not just use Maven? A Maven is a software project management and comprehension tool. As such it can be seen as a kind of build-time module system but, by its nature, it does nothing to support modularity at run time. Q Why not just adopt OSGi? A OSGi is a rich dynamic component system which includes not just a module system but also a life-cycle model and a dynamic service registry. The latter two facilities are useful to some kinds of sophisticated applications, but I don’t think they’re of wide enough interest to be standardized as part of the Java SE Platform. Q Okay, then why not just adopt the module layer of OSGi? A The OSGi module layer is not operative at compile time; it only addresses modularity during packaging, deployment, and execution. As it stands, moreover, it’s useful for library and application modules but, since it’s built strictly on top of the Java SE Platform, it can’t be used to modularize the Platform itself. Q If Maven addresses modularity at build time, and the OSGi module layer addresses modularity during deployment and at run time, then why not just use the two together, as many developers already do? A The combination of Maven and OSGi is certainly very useful in practice today. These systems have, however, been built on top of the existing Java platform; they have not been able to change the platform itself. This means, among other things, that module boundaries are weakly enforced, if at all, which makes it difficult to diagnose configuration errors and impossible to run untrusted code securely. The prototype Jigsaw module system, by contrast, aims to define a platform-level solution which extends both the language and the JVM in order to enforce module boundaries strongly and uniformly in all program phases. Q If the EG chooses an approach like the one currently being taken in the Jigsaw prototype, will Maven and OSGi be made obsolete? A No, not at all! No matter what approach is taken, to ensure wide adoption it’s essential that the standard Java Platform Module System interact well with Maven. Applications that depend upon the sophisticated features of OSGi will no doubt continue to use OSGi, so it’s critical that implementations of OSGi be able to run on top of the Java module system and, if suitably modified, support OSGi bundles that depend upon Java modules. Ideas for how to do that are currently being explored in Project Penrose. Java 8 & Java 9 Q Without Jigsaw, won’t Java 8 be a pretty boring release? A No, far from it! It’s still slated to include the widely-anticipated Project Lambda (JSR 335), work on which has been going very well, along with the new Date/Time API (JSR 310), Type Annotations (JSR 308), and a set of smaller features already in progress. Q Won’t deferring Jigsaw to Java 9 delay the eventual convergence of the higher-end Java ME Platforms with Java SE? A It will slow that transition, but it will not stop it. To allow progress toward that convergence to be made with Java 8 I’ve suggested to the Java SE 8 EG that we consider specifying a small number of Profiles which would allow compact configurations of the SE Platform to be built and deployed. Q If Jigsaw is deferred to Java 9, would the Oracle engineers currently working on it be reassigned to other Java 8 features and then return to working on Jigsaw again after Java 8 ships? A No, these engineers would continue to work primarily on Jigsaw from now until Java 9 ships. Q Why not drop Lambda and finish Jigsaw instead? A Even if the engineers currently working on Lambda could instantly switch over to Jigsaw and immediately become productive—which of course they can’t—there are less than nine months remaining in the Java 8 schedule for work on major features. That’s just not enough time for the broad review, testing, and feedback which such a fundamental change to the Java Platform requires. Q Why not ship the module system in Java 8, and then modularize the platform in Java 9? A If we deliver a module system in one release but don’t use it to modularize the JDK until some later release then we run a big risk of getting something fundamentally wrong. If that happens then we’d have to fix it in the later release, and fixing fundamental design flaws after the fact almost always leads to a poor end result. Q Why not ship Jigsaw in an 8.5 release, less than two years after 8? Or why not just ship a new release every year, rather than every other year? A Many more developers work on the JDK today than a couple of years ago, both because Oracle has dramatically increased its own investment and because other organizations and individuals have joined the OpenJDK Community. Collectively we don’t, however, have the bandwidth required to ship and then provide long-term support for a big JDK release more frequently than about every other year. Q What’s the feedback been on the two-year release-cycle proposal? A For just about every comment that we should release more frequently, so that new features are available sooner, there’s been another asking for an even slower release cycle so that large teams of enterprise developers who ship mission-critical applications have a chance to migrate at a comfortable pace.

    Read the article

  • How to diagnose computer lockup/freezing problem

    - by Scott Mitchell
    I built a desktop computer a couple years back with the following specs: CPU: Intel Core 2 Quad Q9300 Yorkfield 2.5GHz 6MB L2 Cache LGA 775 95W Quad-Core Processor BX80580Q9300 Motherboard: EVGA 122-CK-NF68-T1 LGA 775 NVIDIA nForce 680i SLI ATX Intel Motherboard Video Card: Two EVGA 256-P2-N758-TR GeForce 8600GT SCC 256MB 128-bit GDDR3 PCI Express x16 SLI Supported Video Card PSU: SeaSonic S12 Energy Plus SS-550HT 550W ATX12V V2.3 / EPS12V V2.91 SLI Certified CrossFire Ready 80 PLUS Certified Active PFC Power Supply Memory: Two G.SKILL 4GB (2 x 2GB) 240-Pin DDR2 SDRAM DDR2 800 (PC2 6400) Dual Channel Kit Desktop Memory Model F2-6400CL5D-4GBPQ Since its inception, the machine has periodically locked up, the regularlity having varied over the years from once a day to once a month. Typically, lockups happen once every few days. By "lockup" I mean my computer just freezes. The screen locks up, I can't move the mouse. Hitting keys on my keyboard that normally turn LEDs on or off on the keyboard (such as Caps Lock) no longer turn the LEDs on or off. If there was music playing at the time of the lockup, noise keeps coming out of the speakers, but it's just the current frequency/note that plays indefinitely. There is no BSOD. When such a lockup occurs I have to do a hard reboot by either turning off the computer or hitting the reset button. I have the most recent version of the NVIDIA hardware drivers, and update them semi-regularly, but that hasn't seemed to help. I am currently using Windows 7 x64, but was previously using Windows Server 2003 x64 and having the same lockup issues. My guess is that it's somehow video driver or motherboard related, but I don't know how to go about diagnosing this problem to narrow down which of the two is the culprit. Additional information re: cooling Regarding cooling... I've not installed any after-market cooling systems aside from two regular fans I scavenged from an older computer. The fan atop the CPU is the one that shipped with it. One of the two scavenged fans I added it located at the bottom tower of the corner, in an attempt to create some airflow from front to back. The second fan is pointed directly at the two video cards. SpeedFan installation and readings Per studiohack's suggestion, I installed SpeedFan, which provided the following temperature readings: GPU: 63C GPU: 65C System: 76C CPU: 64C AUX: 36C Core 0: 78C Core 1: 76C Core 2: 79C Core 3: 79C Update #3: Another Lockup :-( Well, I had another lockup last night. :-( SpeedFan reported the CPU temp at 38 C when it happened, and there was no spike in temperature leading up to the freeze. One thing I notice is that the freeze seems more likely to happen if I am watching a video. In fact, of the last 5 freezes over the past month, 4 of them have been while watching a video on Flickr. Not necessarily the same video, but a video nevertheless. I don't know if this is just coincidence or if it means anything. (As an aside, each night before bedtime my 2 year old daughter sits on my lap and watches some home videos on Flickr and, in the last month, has learned the phrase, "Uh oh, computer broke.") Update #4: MemTest86 and 3DMark06 Test Results: Per suggestions in the comments, I ran the MemTest86 overnight and it cycled through the 8 GB of memory 5 times without error. I also ran the 3DMark06 test without a problem (see my scores at http://3dmark.com/3dm06/15163549). So... what now? :-) Any further suggestions on what to check? Is there some way to get a stack trace or something when the computer locks like that? Thanks

    Read the article

  • Duplicate DNS Zones (Error 4515 in Event Log )

    - by Campo
    I am getting these two error in the DNS Event log (errors at end of question). I have confirmed I do have duplicate zones. I am wondering which ones to delete. The DomainDNSZone contains all of our DNS records but it does not have the _msdcs zone.... that is in the ForestDNSZone with the duplicates that are not in use. here is a picture of that 3 Questions. I understand the advantages of having DNS in the ForestDNSZone. so... Why is DNS using the DomainDNSZone and is that acceptable considering _msdcs... is in the ForestDNSZone? If so, should I just delete the DC=1.168.192.in-addr.arpa and DC=supernova.local from the ForestDNSZone? Or should I try to get those to be the ones in use? What are those steps? I understand how to delete. That is simple but if i must move zones some info would be appreaciated there. Just to confirm. from my understanding. I can delete the two duplicates in the ForestDNSZone and leave the _msdcs.supernova.local as thats required there. This will resolve the erros I see. Just fyi when I look in those folders from the ForestDNSZone they have just 2 and 1 entries respectively. So obviously not in use compared to the others. I am pretty sure I understand the steps to complete this. But if you would like to provide that info, bonus points! Event Type: Warning Event Source: DNS Event Category: None Event ID: 4515 Date: 1/4/2011 Time: 2:14:18 PM User: N/A Computer: STANLEY Description: The zone 1.168.192.in-addr.arpa was previously loaded from the directory partition DomainDnsZones.supernova.local but another copy of the zone has been found in directory partition ForestDnsZones.supernova.local. The DNS Server will ignore this new copy of the zone. Please resolve this conflict as soon as possible. If an administrator has moved this zone from one directory partition to another this may be a harmless transient condition. In this case, no action is necessary. The deletion of the original copy of the zone should soon replicate to this server. If there are two copies of this zone in two different directory partitions but this is not a transient caused by a zone move operation then one of these copies should be deleted as soon as possible to resolve this conflict. To change the replication scope of an application directory partition containing DNS zones and for more details on storing DNS zones in the application directory partitions, please see Help and Support. For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp. Data: 0000: 89 25 00 00 %.. AND Event Type: Warning Event Source: DNS Event Category: None Event ID: 4515 Date: 1/4/2011 Time: 2:14:18 PM User: N/A Computer: STANLEY Description: The zone supernova.local was previously loaded from the directory partition DomainDnsZones.supernova.local but another copy of the zone has been found in directory partition ForestDnsZones.supernova.local. The DNS Server will ignore this new copy of the zone. Please resolve this conflict as soon as possible. If an administrator has moved this zone from one directory partition to another this may be a harmless transient condition. In this case, no action is necessary. The deletion of the original copy of the zone should soon replicate to this server. If there are two copies of this zone in two different directory partitions but this is not a transient caused by a zone move operation then one of these copies should be deleted as soon as possible to resolve this conflict. To change the replication scope of an application directory partition containing DNS zones and for more details on storing DNS zones in the application directory partitions, please see Help and Support. For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp. Data: 0000: 89 25 00 00 %..

    Read the article

  • Add a small RAID card? Will it help overall stability and performance of my nine hard drives?

    - by Ray
    Hi, Will I get any extra genuine added performance and RAID stability if I insert a basic RAID card into a PCI-E x1 slot? I am considering the Adaptec 1220SA - 2 port SATA , pci-express (1x) , raid 0/1. Ok it only supports two SATA drives. Purpose is to help support the eight internal hard drives (1TB each), a DVD drive and an external e-SATA connected 2TB hard drive - by dealing with two of the internal hard drives. My current configuration of eight internal 1TB Barracuda (7200.12) SATA hard drives, one external 2TB SATA Western Digital Green Drive (e-SATA) and one DVD drive can already be supported by the Intel P55 & JMicron controllers on the ASUS motherboard : the Intel P55 (controls six HDD; configured as three x RAID 1), and the JMicron (controls two HDD as one RAID 1, as well as the DVD drive and the external SATA drive via the motherboard's e-SATA port (controlled by the JMicron)). Bigger picture details : I have an ASUS motherboard designed for the LGA1156 type processor and it includes the Intel P55 Express Chipset and JMicron. I am using the Intel Core i7-870 processor, and have 8GB DDR3 (1333) memory (four x 2GB Corsair DIMMs). Enough overall power. The power supply is more than sufficicient for the system. Corsair AX850. The system will never need the full 850 watts (future : second graphics card). The RAID card would provide hardware RAID 1 for two of the eight intrnal drives. It would either reduce the load on : the Intel P55 firmware RAID support, or replace the JMicron controller's RAID 1 set. I am busy installing the above configuration using Windows 7 Ultimate 64-bit as the OS. The RAID card is a last minute addition to the plan. Is it worth spending the extra R700 - R900 on the Adaptec 1220SA, or equivalent RAID card? I cannot afford to spend yet another R2000 - R3000 on a RAID card that would support many SATA2 hard drives, with a better RAID, example the RAID 5. My Issue & assumption : I am trusting that the Intel P55 chipset can properly handle six drives, configured as three * RAID 1. I am assuming that the JMicron can handle, using its RED SATA ports, one RAID-1 (two HDDs). The DVD drive connects to the JMicron optical SATA port 1 (white port 1). White port 2 is not used. The e-SATA connection is from the JMicron straight to, and through the motherboard - to an on-board (rear panel) e-SATA port. Am I being a little hopeful in only using the on-board Intel P55 and the JMicron? Is it a waste of money to install a RAID card that handles two SATA2 drives? OR Is it wisdom to take the pressure a little off the Intel P55? Obviously I am interested in data security, hence RAID 1, not RAID Zero. RAID 5 would be nice. The CPU, Intel Core i7-870 will provide the clout. Context to nine drives : I am using virtualisation with Windows 7 Ultimate. Bootable VMs. The operating system gets a mirror. Loaded apps gets a mirror. The current design data is kept in another mirror and Another mirror is back-up one and / or VM territory. Then the external 2TB drive (via e-SATA) is the next layer of data security and then finally, I use off-site data security. Thanks.

    Read the article

  • Exchange Mail Flow

    - by Tuck918
    Hello. I have a question. We have one Exchange 2003 server and two Exchange 2007 servers. Most all of our mailboxes are on 2007 but we do still have one shared mailbox, unity mailbox and a journling mailbox on 2003. Public Folders have been set to replicate to 2007. I have set up a send connector on 2007 with a cost of 1. Receive connectors have Anonymous Users checked on 2007. On 2003 there are two connectors: the Internet Email connector and the connector that connects 2003 to 2007. We have a SPAM filtering device that email goes through before it is handed off to Exchange. The SPAM filtering device is set to send email to one of our Exchange 2007 servers. Here is my question/problem: Even though the SPAM filtering device is set to forward email to Exchange 2007, somehow all of our email is still going through the Exchange 2003 server before it finally hits the users mailboxes on the Exchange 2007 server. How can I change it so that all email goes directly to Exchange 2007 and never routes through Excahnge 2003 both ways, inbound and outbound? Would also like to add: In the EMC under Org- Hub- Send Connector there are two connectors. One is the "Internet Connector" from the 2003 box and the other is the new one I created. THe address space on the 2003 one is set to a cost of 2, no smart hosts and the 2003 box is listed as the Source Server. THe other Send Connector has an address space of 1, no smart host and has the 2 excahnge 2007 servers listed as the source servers. In EMC under Server- Hub- my two exchange 2007 servers are listed. Each one has 2 receive connectors. Both Recieve Connectors are setup the same way. THe Default Receive Connector has Anonymous Users checked. The other Recieve Connector is labled "Client" and I am not sure what it does or why its there. Anonymous Users are not checked. No smart hosts configured on 2003. Additional details Currently we have 3 excahnge servers. One exchange 2003 server and two excahnge 2007 servers. THe exchange 2003 server is the acting "bridgehead" serverand all email is routing through this server, inbound and outbound. We are wanting to decommission this server and use our two exchange 2007 servers as our mailbox servers. All of of user mailboxes are already on one of the exchange 2007 boxes and we want to put whats left on the exchange 2003 box on our other excahnge 2007 box. Both excahnge 2007 servers are currently CAS, HT and MB servers. We have a SPAM filtering device that sits between our excahnge servers and the firewall and have it configured to send messages to one of the excahgne 2007 servers but when we look at the message headers we can see that messgaes are still being routed to the excahnge 2003 box. We want to bypass the exchange 2003 in the routing process as it is dying and is starting to have major issues so everytime it goes down our email is down. Is there possible some sort of AD routing link/site link stuff going on?

    Read the article

  • How to diagnose computer lockups and freezes?

    - by Scott Mitchell
    I built a desktop computer a couple years back with the following specs: CPU: Intel Core 2 Quad Q9300 Yorkfield 2.5GHz 6 MB L2 Cache LGA 775 95W Quad-Core Processor BX80580Q9300 Motherboard: EVGA 122-CK-NF68-T1 LGA 775 NVIDIA nForce 680i SLI ATX Intel Motherboard Video Card: Two EVGA 256-P2-N758-TR GeForce 8600GT SCC 256 MB 128-bit GDDR3 PCI Express x16 SLI Supported Video Card PSU: SeaSonic S12 Energy Plus SS-550HT 550W ATX12V V2.3 / EPS12V V2.91 SLI Certified CrossFire Ready 80 PLUS Certified Active PFC Power Supply Memory: Two G.SKILL 4 GB (2 x 2 GB) 240-Pin DDR2 SDRAM DDR2 800 (PC2 6400) Dual Channel Kit Desktop Memory Model F2-6400CL5D-4GBPQ Since its inception, the machine has periodically locked up, the regularity having varied over the years from once a day to once a month. Typically, lockups happen once every few days. By "lockup" I mean my computer just freezes. The screen locks up, I can't move the mouse. Hitting keys on my keyboard that normally turn LEDs on or off on the keyboard (such as Caps Lock) no longer turn the LEDs on or off. If there was music playing at the time of the lockup, noise keeps coming out of the speakers, but it's just the current frequency/note that plays indefinitely. There is no BSOD. When such a lockup occurs I have to do a hard reboot by either turning off the computer or hitting the reset button. I have the most recent version of the NVIDIA hardware drivers, and update them semi-regularly, but that hasn't seemed to help. I am currently using Windows 7 x64, but was previously using Windows Server 2003 x64 and having the same lockup issues. My guess is that it's somehow video driver or motherboard related, but I don't know how to go about diagnosing this problem to narrow down which of the two is the culprit. Additional information re: cooling Regarding cooling... I've not installed any after-market cooling systems aside from two regular fans I scavenged from an older computer. The fan atop the CPU is the one that shipped with it. One of the two scavenged fans I added it located at the bottom tower of the corner, in an attempt to create some airflow from front to back. The second fan is pointed directly at the two video cards. SpeedFan installation and readings Per studiohack's suggestion, I installed SpeedFan, which provided the following temperature readings: GPU: 63C GPU: 65C System: 76C CPU: 64C AUX: 36C Core 0: 78C Core 1: 76C Core 2: 79C Core 3: 79C Update #3: Another Lockup :-( Well, I had another lockup last night. :-( SpeedFan reported the CPU temp at 38 C when it happened, and there was no spike in temperature leading up to the freeze. One thing I notice is that the freeze seems more likely to happen if I am watching a video. In fact, of the last 5 freezes over the past month, 4 of them have been while watching a video on Flickr. Not necessarily the same video, but a video nevertheless. I don't know if this is just coincidence or if it means anything. (As an aside, each night before bedtime my 2 year old daughter sits on my lap and watches some home videos on Flickr and, in the last month, has learned the phrase, "Uh oh, computer broke.") Update #4: MemTest86 and 3DMark06 Test Results: Per suggestions in the comments, I ran the MemTest86 overnight and it cycled through the 8 GB of memory 5 times without error. I also ran the 3DMark06 test without a problem (see my scores at http://3dmark.com/3dm06/15163549). So... what now? :-) Any further suggestions on what to check? Is there some way to get a stack trace or something when the computer locks like that? Resolution I have never did figure out the particular problems, but based on the suggestions here and elsewhere, I'm presuming it was a motherboard issue. In any event, I recently upgraded my system, buying a new motherbeard, PSU, CPU, and RAM, and that new rig has been working splendidly the past several weeks. I am using the same graphic cards as in the old setup, so I think it's safe to reason that they weren't the cause of the problem.

    Read the article

  • WNDR3700 Router + Cisco SG200-08 + LACP + Dual Uplink

    - by kobaltz
    Background I have a storage server that has several virtual machine images stored on them. I would store them locally, but I have limited space on my desktop (using SSD storage). I would like to increase the bandwidth between the desktop and the storage server by using two NICs on each computer. My original configuration allowed about 55MBps between the desktop and storage server. This storage server also has several TBs of documents, pictures, movies, vms, and ISO/programs. The storage server has 8 1.5TB hard drives in a RAID 10 configuration with a hardware RAID controller. The benchmarks on the RAID 10 are about 300MBps. Configuration In short, I am trying to bridge my switch and router. The switch is a small 8 port Cisco smart switch that supports 802.3ad LACP. I have two computers plugged into the switch, each with 2 Intel Gigabit NICs. The first computer is a Windows 7 machine that has the Intel ANS software installed. I have LACP configured with the computer and now show 3 NICs (2 Physical + 1 TEAM Virtual @ 2Gbps). It looks like this computer is configured correctly. I trunked the two ports that this computer is plugged into with the switch's web interface. The second computer is a homebrew storage box running debian. I also have the bonding enabled on this machine and the switch configured with LACP. Without having the WNDR3700 router in the picture yet, I am able to communicate between the Windows 7 machine and the debian box since they both have static IP addresses. With LACP enabled on both machines I am getting about 106-108MBps speeds. Issue I plug in a network cable from the switch into the router and enable DHCP on the desktop. I saw no need to have a static address on the desktop. My transfer rates are still from 106MBps-108MBps. While this is still a boost, I am trying to figure out how to get about 140-180MBps. I am thinking that I need to increase the bandwidth from the router to the switch. My switch allows 4 groups for port trunking. I plugged in a second network cable from the router to the switch. My question is, what is the proper way to fix this issue. Should I port trunk the two ports that are going from the switch to the router? Keep in mind that the router is a WNDR3700 and is unsure whether or not it supports LACP. I do have OpenWRT installed on the router, but it still wasn't clear in any documentation that I found if it supported 802.3ad LACP standards. I am also wondering if there needs to be anything changed within the Cisco settings. [Edit] - Corrected some numbers, wasn't really paying attention. It looks like the speeds though at least two NICs are bonded with LACP is still reaching the max bandwidth of one port. Is there a way to configure the switch so that I can increase this bandwidth? Also, on the storage server, I had a couple of extra NICs laying around and threw them on there as well. Another EDIT and More Findings I happened to look at the traffic of each individual NIC and think that I see the problem. I tested with a simple transfer for a 4GB file. I noticed that only one of the NICs was taking the load of the traffic. I then copied the file back to the Storage Server and noticed that the other NIC was sending out the traffic. I have 802.3ad LACP enabled on the two NICs and I see that it gets enabled dynamically on the switch's interface. Should I be using Static Link Aggregation?

    Read the article

  • MD3200 - 3 to 4x less throughput than MD1220. Am I missing something here?

    - by Igor Polishchuk
    I have two R710 servers with similar configuration. One in my office has MD1220 attached. Another one in the datacenter of my hosting services vendor has MD3200. I'm getting significantly worse throughput from MD3200 at my vendors setup. I'm mostly interested in sequential writes, and I'm getting these results in bonnie++ and dd tests: Seq. writes on MD1220 in my office: 1.1 GB/s - bonnie++, 1.3GB/s - dd Seq. writes on MD3200 at my vendor's: 240MB/s - bonnie++, 310MB/s - dd Unfortunately, I could not test the exactly the same configurations, but the two I have should be comparable. If anything, my good performing environment is cheaper than the bad performing. I expect at least similar throughput from these two setups. My vendor cannot really help me. Hopefully, somebody more familiar with the DAS performance can look at it and tell if I'm missing something here and my expectations are too high. To summarize, the question here is it reasonable to expect about 100MB/s of sequential write throughput per each couple of drives in RAID10 on MD3200? Is there any trick to enable such performance in MD3200 with dual controller as opposed to simple MD1220 with a single H800 adapter? More details about the configurations: A good one in my office: Dell R710 2CPU X5650 @ 2.67GHz 12 cores 96GB DDR3, OS: RHEL 5.5, kernel 2.6.18-194.26.1.el5 x86_64 20x300GB 2.5" SAS 10K in a single RAID10 1MB chunk size on MD1220 + Dell H800 I/O controller with 1GB cache in the host Not so good one at my vendor's: Dell R710 2CPU L5520 @ 2.27GHz 8 cores 144GB DDR3, OS: RHEL 5.5, kernel 2.6.18-194.11.4.el5 x86_64 20x146GB 2.5" SAS 15K in a single RAID10 512KB chunk size, Dell MD3200, 2 I/O controllers in array with 1GB cache each Additional information. I've also ran the same tests on the same vendor's host, but the storage was: two raids of 14x146GB 15K RPM drives RAID 10, striped together on the OS level on MD3000+MD1000. The performance was about 25% worse than on MD3200 despite having more drives. When I ran similar tests on the internal storage of my vendor's host (2x146GB 15K RPM drives RAID1, Perc 6i) I've got about 128MB/s seq. writes. Just two internal drives gave me about a half of 20 drives' throughput on MD3200. The random I/O performance of the MD3200 setup is ok, it gives me at least 1300 IOPS. I'm mostly have problems with sequentioal I/O throughput. Thank you for looking into it. Regards Igor

    Read the article

  • Outlook Anywhere inconsistencies with authentication methods

    - by gravyface
    So I've read this question and attempted just about every other workaround I've found online. Problem seems completely illogical to me, anyways: SBS 2011, vanilla install; haven't touched anything in IIS or Exchange outside of what's been done through the checklist (brand new domain, completely new customer) except to import an existing wildcard certificate for *.example.com (which is valid, Remote Web Workplace and Outlook Web Access work fine). On the two test machines and one production machine running a mixture of Windows XP Pro, Windows 7 and Outlook 2003 through to 2010, I've had no problem saving the password after configuring Outlook Anywhere using the wrong authentication method. I repeat, I have had no issues using the wrong authentication method on these test machines; password saves the first time, no problem, can verify it exists in the credentials manager (Start Run control userpasswords2), close Outlook, reboot, go make a sammie, come back, credentials are still saved. When I say wrong, it's because I was choosing NTLM and Exchange (under Exchange Console Server Configuration Client Access) was set by default to use Basic. On two completely different machines setup by a co-worker, they had (under my guidance) used NTLM as well... except that frustratingly, Outlook would always ask for a password. One machine was Windows XP with Outlook 2010, the other was Windows 7 with Outlook 2003. When these two machines were set to use Basic -- the correct settings -- the option to save was there and now works without issue. Puzzled by how my machines could possibly work with the wrong authentication, I then went into one of them and changed the authentication method to Basic. Now here's where it gets a little crazy: if I go under Outlook and change the authentication to use the correct setting (Basic) it fails to save the password and Outlook prompts every time (without a "remember me" checkbox). I have not had a chance to change it to Basic on the other two machines to see if this is just a fluke or not, but something just isn't right here. My two hunches are either a missing/installed KB Update or perhaps a local security policy. I should add that none of the 5 test machines in the equation here have ever been joined to the domain.

    Read the article

  • recommendations for efficient offsite remote backup solution of vm's

    - by senorsmile
    I am looking for recommendations for backing up my current 6 vm's(and soon to grow to up to 20). Currently I am running a two node proxmox cluster(which is a debian base using kvm for virtualization with a custom web front end to administer). I have two nearly identical boxes with amd phenom II x4's and asus motherboards. Each has 4 500 GB sata2 hdd's, 1 for the os and other data for the proxmox install, and 3 using mdadm+drbd+lvm to share the 1.5 TB's of storage between the two machines. I mount lvm images to kvm for all of the virtual machines. I currently have the ability to do live transfer from one machine to the other, typically within seconds(it takes about 2 minutes on the largest vm running win2008 with m$ sql server). I am using proxmox's built-in vzdump utility to take snapshots of the vm's and store those on an external harddrive on the network. I then have jungledisk service (using rackspace) to sync the vzdump folder for remote offsite backup. This is all fine and dandy, but it's not very scalable. For one, the backups themselves can take up to a few hours every night. With jungledisk's block level incremental transfers, the sync only transfers a small portion of the data offsite, but that still takes at least a half an hour. The much better solution would of course be something that allows me to instantly take the difference of two time points (say what was written from 6am to 7am), zip it, then send that difference file to the backup server which would instantly transfer to the remote storage on rackspace. I have looked a little into zfs and it's ability to do send/receive. That coupled with a pipe of the data in bzip or something would seem perfect. However, it seems that implementing a nexenta server with zfs would essentially require at least one or two more dedicated storage servers to serve iSCSI block volumes (via zvol's???) to the proxmox servers. I would prefer to keep the setup as minimal as possible (i.e. NOT having separate storage servers) if at all possible. I have also briefly read about zumastor. It looks like it could also do what I want, but it appears to have halted development in 2008. So, zfs, zumastor or other?

    Read the article

  • Web Service gets unavailable after several concurrent calls

    - by Roman
    We are testing GoDaddy Virtual Data Center and came to a very strange issue when our web site gets unavailable. GoDaddy Support keeps saying the issue is in our web server settings, but looking at the result of our tests I doubt it. TEST ENVIRONMENT Virtual DataCenter with Windows hosted at GoDaddy.com. All servers have Windows Server 2008 R2 Datacenter, IIS 7. Server One with IP address 10.1.0.4 Server Two with IP address 10.1.0.3 Both servers are in private network not visible from outside. Port Forward with IP address 50.62.13.174. Port Forward is assigned to Server One TEST DESCRIPTION JMeter is used as a Client App to simulate 30 concurrent users sending 100 SOAP requests each. Interval between requests is 1 second. Http link used for testing: http://50.62.13.174/v2/webservices.asmx TEST ONE Test is run from a computer in our office. After JMeter starts running test, almost immediately, the link above becomes unavailable in a browser. After test completion, the link is not available in a browser for about 5 more minutes. Remote Desktop is working well, so we can connect to Server One remotely. After about 5 minutes since test completion, the link becomes available in a browser again. TEST TWO Test is run from Server Two (that is part of our virtual data center). Test works very well, no visible delays in processing. The link is available in a browser all the time. TEST THREE Test is run from Server One using localhost. The result is the same as in TEST TWO - no issues. TEST FOUR We repeated TEST ONE from other computers that we have located in different countries, all with the same result as TEST ONE. CONCLUSION As the test works well from Server Two, but does not work from outside our virtual data center, we feel there are issues with the network or its capacity. The whole behaviour looks like out requests from outside get stuck somewhere before reaching our virtual data center. Has anybody had similar issues in the past? Are there chances that something is wrong with our server settings?

    Read the article

  • Why might one host be unable to access the Internet, when it can ping the router and when all other hosts can?

    - by user1444233
    I have a Draytek Vigor 2830n. It's kicking out a 192.168.3.0 LAN. It performs load-balancing across dual-WAN ports, although I've turned off the second WAN to simplify testing. There are many hosts on the LAN. All IPs are allocated through DHCP, most freely allocated from the pool, but one or two are bound to NIC MAC addresses. All hosts can access the Internet, save one. That host (192.168.3.100 or 'dot100' for short) gets allocated an IP address (and the right gateway address, DNS server addresses, subnet etc.) dot100 can ping itself. It can ping the gateway, and access the latter's web interface via port 80. It's responsive and loss-free (sustained ping over a couple of minutes reports no data loss). Yet, for some reason that evades me, dot100 can't ping an external IP address or domain name. I suspect it's never been able to, because it was getting some Internet access from a second adaptor (different subnet), but that's now been turned off, which exposed the problem. In dot100, I've tried: two operating systems (Windows 8 and Knoppix), to rule out anti-virus programs etc. two physical adaptors two cables, on each adaptor two IPs (e.g. .100 and .103 assigned by Mac and .26 from the pool) both dynamic and assigned (MAC-bound) DHCP-allocated IPs but none of this experiments yielded any variation in the result. dot100 is a crucial host. It's a file server for the network, so I need it to be reliably allocated a consistent IP. Can anyone offer a potential solution or a way forward with the analysis please? My guess My analysis so far leads me to believe it's a router issue. I've checked the web interface very carefully. There are no filters setup in Firewall - General Setup or Filter Setup. I suspect it's a corrupted internal routing table, but the web UI shows this as the Routing table: Key: C - connected, S - static, R - RIP, * - default, ~ - private * 0.0.0.0/ 0.0.0.0 via 62.XX.XX.X WAN1 * 62.XX.XX.X/ 255.255.255.255 via 62.XX.XX.X WAN1 S 82.YY.YYY.YYY/ 255.255.255.255 via 82.YY.YYY.YYY WAN1 C 192.168.1.0/ 255.255.255.0 directly connected WAN2 C~ 192.168.3.0/ 255.255.255.0 directly connected LAN2

    Read the article

< Previous Page | 167 168 169 170 171 172 173 174 175 176 177 178  | Next Page >