Search Results

Search found 829 results on 34 pages for 'scalable'.

Page 1/34 | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • dynamically horizontal scalable key value store

    - by Zubair
    Hi, Is there a key value store that will give me the following: Allow me to simply add and remove nodes and will redstribute the data automatically Allow me to remove nodes and still have 2 extra data nodes to provide redundancy Allow me to store text or images up to 1GB in size Can store small size data up to 100TB of data Fast (so will allow queries to be performed on top of it) Make all this transparent to the client Works on Ubuntu/FreeBSD or Mac Free or open source I basically want something I can use a "single", and not have to worry about having memcached, a db, and several storage components so yes, I do want a database "silver bullet" you could say. Thanks Zubair Answers so far: MogileFS on top of BackBlaze - As far as I can see this is just a filesystem, and after some research it only seems to be appropriate for large image files Tokyo Tyrant - Needs lightcloud. This doesn't auto scale as you add new nodes. I did look into this and it seems it is very fast for queries which fit onto a single node though Riak - This is one I am looking into myself, but I don't have any results yet Amazon S3 - Is anyone using this as their sole persistance layer in production? From what I have seen it seems to be used for storage of images as complex queries are too expensive @shaman suggested Cassandra - definitely one I am looking into So far it seems that there is no database or key value store that fulfills the criteria I mentioned, not even after offering a bounty of 100 points did the question get answered!

    Read the article

  • Best approach for a scalable PHP (AJAX based) chat system

    - by Simon
    Hi, I'm building a chat system for a company and I'm wondering as to what the best way to build the system would be? The current setup we have is a Nginx HTTP Server with PHP and Memcacheq (as a message queue that appends the chat messages to the user's own queue). We then poll the Nginx server (through a Comet style request) and query the message queue for updates. Is it a good idea to use a message queue such as Memcacheq to handle a chat system that has both user-to-user and site-wide chat or is it best to just stick to MySQL? Thanks!

    Read the article

  • Google I/O 2010 - Making smart & scalable Wave robots

    Google I/O 2010 - Making smart & scalable Wave robots Google I/O 2010 - Making smart & scalable Wave robots Wave 201 David Byttow, Marcel Prasetya A smart robot must be able to store persistent data. Wave robots can store data in wave structures, like wavelets, datadocs, and annotations, instead of traditional datastores. A scalable robot must perform operations with minimal bandwidth. Wave robots can optimize by selecting the appropriate amount of context, the optimal events, and narrow filters for events. In this talk, we'll share best practices on data storage and scaling. For all I/O 2010 sessions, please go to code.google.com From: GoogleDevelopers Views: 9 0 ratings Time: 58:25 More in Science & Technology

    Read the article

  • Building massively scalable systems, where to start? [closed]

    - by Mahmoud Hossam
    Recently, I've been seeing these job postings about building scalable systems using Java, and some of the technologies mentioned were: Cassandra Thrift Hadoop MapReduce Among others. How can I get started with these technologies? Is there something else I need to know before actually learning any of these technologies? Maybe some general concepts about building highly available and scalable systems? I already know Java SE, so I won't be starting from scratch.

    Read the article

  • How to design a scalable notification system?

    - by Trent
    I need to write a notification system manager. Here is my requirements: I need to be able to send a Notification on different platforms, which may be totally different (for exemple, I need to be able to send either an SMS or an E-mail). Sometimes the notification may be the same for all recipients for a given platform, but sometimes it may be a notification per recipients (or several) per platform. Each notification can contain platform specific payload (for exemple an MMS can contains a sound or an image). The system need to be scalable, I need to be able to send a very large amount of notification without crashing either the application or the server. It is a two step process, first a customer may type a message and choose a platform to send to, and the notification(s) should be created to be processed either real-time either later. Then the system needs to send the notification to the platform provider. For now, I end up with some though but I don't know how scalable it will be or if it is a good design. I've though of the following objects (in a pseudo language): a generic Notification object: class Notification { String $message; Payload $payload; Collection<Recipient> $recipients; } The problem with the following objects is what if I've 1.000.000 recipients ? Even if the Recipient object is very small, it'll take too much memory. I could also create one Notification per recipient, but some platform providers requires me to send it in batch, meaning I need to define one Notification with several Recipients. Each created notification could be stored in a persistent storage like a DB or Redis. Would it be a good it to aggregate this later to make sure it is scalable? On the second step, I need to process this notification. But how could I distinguish the notification to the right platform provider? Should I use an object like MMSNotification extending an abstract Notification? or something like Notification.setType('MMS')? To allow to process a lot of notification at the same time, I think a messaging queue system like RabbitMQ may be the right tool. Is it? It would allow me to queue a lot of notification and have several worker to pop notification and process them. But what if I need to batch the recipients as seen above? Then I imagine a NotificationProcessor object for which I could I add NotificationHandler each NotificationHandler would be in charge to connect the platform provider and perform notification. I can also use an EventManager to allow pluggable behavior. Any feedbacks or ideas? Thanks for giving your time. Note: I'm used to work in PHP and it is likely the language of my choice.

    Read the article

  • Edinburgh this Thurs (25th) - Rob Carrol talks about how to build a high performance, scalable repor

    - by tonyrogerson
    Scottish Area SQL Server User Group Meeting, Edinburgh - Thursday 25th March An evening of SQL Server 2008 Reporting Services Scalability and Performance with Rob Carrol, see how to build a high performance, scalable reporting platform and the tuning techniques required to ensure that report performance remains optimal as your platform grows. Pizza and drinks will be provided! Register at http://www.sqlserverfaq.com/events/221/SQL-Server-2008-Reporting-Services-Scalability-and-Performance.aspx...(read more)

    Read the article

  • Move Data into the grid for scalable, predictable response times

    - by JuergenKress
    CloudTran is pleased to introduce the availability of the CloudTran Transaction and Persistence Manager for creating scalable, reliable data services on the Oracle Coherence In-Memory Data Grid (IMDG). Use of IMDG architectures has been key to handling today’s web-scale loads because it eliminates database latency by storing important and frequently access data in memory instead of on disk. The CloudTran product lets developers easily use an IMDG for full ACID-compliant transactions without having to be concerned about the location or spread of data. The system has its own implementation of fast, scalable distributed transactions that does NOT depend on XA protocols but still guarantees all ACID properties. Plus, CloudTran asynchronously replicates data going into the IMDG to back-end datastores and back-up data centers, again ensuring ACID properties. CloudTran can be accessed through Java Persistence API (JPA via TopLink Grid) and now, through a new Low-Level API, or LLAPI. This is ideal for use in SOA applications that need data reliability, high availability, performance, and scalability. It is still in its limited beta release, the LLAPI gives developers the ability to use standard put/remove logic available in Coherence and then wrap logic with simple Spring annotations or XML+AspectJ to start transactions. An important feature of LLAPI is the ability to join transactions. This is a common outcome for SOA applications that need to reduce network traffic by aggregating data into single cache entries and then doing SOA service processing in the node holding the data. This results in the need to orchestrate transaction processing across multiple service calls. CloudTran has the capability to handle these “multi-client” transactions at speed with no loss in ACID properties. Developing software around an IMDG like Oracle Coherence is an important choice for today’s web-scale applications and services. But this introduces new architectural considerations to maintain scalability in light of increased network loads and data movement. Without using CloudTran, developers are faced with an incredibly difficult task to ensure data reliability, availability, performance, and scalability when working with an IMDG. Working with highly distributed data that is entirely volatile while stored in memory presents numerous edge cases where failures can result in data loss. The CloudTran product takes care of all of this, leaving developers with the confidence and peace of mind that all data is processed correctly. For those interested in evaluating the CloudTran product and IMDGs, take a look at this link for more information: http://www.CloudTran.com/downloadAPI.ph , or send your questions to [email protected]. SOA & BPM Partner Community For regular information on Oracle SOA Suite become a member in the SOA & BPM Partner Community for registration please visit  www.oracle.com/goto/emea/soa (OPN account required) If you need support with your account please contact the Oracle Partner Business Center. Blog Twitter LinkedIn Mix Forum Technorati Tags: CloudTran,data grid,M,SOA Community,Oracle SOA,Oracle BPM,BPM,Community,OPN,Jürgen Kress

    Read the article

  • Move Data into the Grid for Scalable, Predictable Response Times

    - by JuergenKress
    CloudTran is pleased to introduce the availability of the CloudTran Transaction and Persistence Manager for creating scalable, reliable data services on the Oracle Coherence In-Memory Data Grid (IMDG). Use of IMDG architectures has been key to handling today’s web-scale loads because it eliminates database latency by storing important and frequently access data in memory instead of on disk. The CloudTran product lets developers easily use an IMDG for full ACID-compliant transactions without having to be concerned about the location or spread of data. The system has its own implementation of fast, scalable distributed transactions that does NOT depend on XA protocols but still guarantees all ACID properties. Plus, CloudTran asynchronously replicates data going into the IMDG to back-end datastores and back-up data centers, again ensuring ACID properties. CloudTran can be accessed through Java Persistence API (JPA via TopLink Grid) and now, through a new Low-Level API, or LLAPI. This is ideal for use in SOA applications that need data reliability, high availability, performance, and scalability. Still in limited beta release, the LLAPI gives developers the ability to use standard put/remove logic available in Coherence and then wrap logic with simple Spring annotations or XML+AspectJ to start transactions. An important feature of LLAPI is the ability to join transactions. This is a common outcome for SOA applications that need to reduce network traffic by aggregating data into single cache entries and then doing SOA service processing in the node holding the data. This results in the need to orchestrate transaction processing across multiple service calls. CloudTran has the capability to handle these “multi-client” transactions at speed with no loss in ACID properties. Developing software around an IMDG like Oracle Coherence is an important choice for today’s web-scale applications and services. But this introduces new architectural considerations to maintain scalability in light of increased network loads and data movement. Without using CloudTran, developers are faced with an incredibly difficult task to ensure data reliability, availability, performance, and scalability when working with an IMDG. Working with highly distributed data that is entirely volatile while stored in memory presents numerous edge cases where failures can result in data loss. The CloudTran product takes care of all of this, leaving developers with the confidence and peace of mind that all data is processed correctly. For those interested in evaluating the CloudTran product and IMDGs, take a look at this link for more information: http://www.CloudTran.com/downloadAPI.php, or, send your questions to [email protected]. WebLogic Partner Community For regular information become a member in the WebLogic Partner Community please visit: http://www.oracle.com/partners/goto/wls-emea ( OPN account required). If you need support with your account please contact the Oracle Partner Business Center. BlogTwitterLinkedInMixForumWiki Technorati Tags: Coherence,cloudtran,cache,WebLogic Community,Oracle,OPN,Jürgen Kress

    Read the article

  • Scalable web-hosting for a youtube-like service (no, not porn) [closed]

    - by Crawling Pasta Hellion
    Possible Duplicate: How to find web hosting that meets my requirements? My business partner and I are looking for a European web-hosting service (we are situated in Europe). That service needs to be, needs to have: international servers, a server for each continent at the very least. a high amount of bandwidth. highly scalable, since we are expecting to start off small, but as our user base grows so will everything else (again, no porn or phallic jokes) need to do. a moderate to supreme customer service. of course a small downtime per annum. affordable at first, fair as we grow. I think that is all. Any input is greatly appreciated. Thank you in advance.

    Read the article

  • Scalable solution for website polling

    - by Tom Irving
    I'm looking to add push notifications to one of my iOS apps. The app is a client for a website which doesn't offer push notifications. What I've come up with so far: App sends a message to home server when transitioning to background, asking the server to start polling the website for the logged in user. The home server starts a new process to poll for that user. Polling happens every so many seconds / minutes. When the user returns to the iOS app, the app sends a message to the home server to stop polling. The home server kills the process polling for the user. Repeat. The problem is that this soon becomes stupid: 100s of users means 100s of different processes. It's just not scalable in the slighest. What I've written so far is in PHP, using CURL to do the polling and I started with PHP a few days ago, so maybe I'm missing something obvious that could help me with this. Some advice would be great.

    Read the article

  • Scalable Architecture for modern Web Development [on hold]

    - by Jhilke Dai
    I am doing research about Scalable architecture for Web Development, the research is solely to support Modern Web Development with flexible architecture which can scale up/down according to the needs without losing any core functionality. By Modern Web I mean to support all the Devices used to access websites, but the loading mechanism for all devices would be different. My quest of architecture is: For PC: Accessing web in PC is faster but it also depends on the Geo-location, so, the application would check by default the capacity of Internet/Browser and load the page according to it. For Mobile: Most of the mobile design these days either hide information or use different version of same application. eg: facebook uses m.facebook.com which is completely different than PC version. Hiding the things from Mobile using JavaScript or CSS is not a solution as it'll consume the bandwidth and make the application slow. So, my architecture research is about Serving one Application, which has different stack. When the application receives the request it'd send the Packaged Stack to the received request. This way the load time for end users would be faster and maintenance of application for developers would be easier. I am researching about for 4-tier(layered) architecture like: Presentation Layer Application Logic Layer -- The main Logic layer which stores the Presentation Stack Business Logic Layer Data Layer Main Question: Have you come across of similar architecture? If so, then can you list the links here, I'm very much interested to learn about those implementations specially in real world scenario. Have you thought about similar architectures and tried your own ideas, or if you have any ideas regarding this, then I urge to share. I am open to any discussions regarding this, so, please feel free to comment/answer.

    Read the article

  • OUM is Flexible and Scalable

    - by user535886
    Flexible and Scalable Traditionally, projects have been focused on satisfying the contents of a requirements document or rigorously conforming to an existing set of work products. Often, especially where iterative and incremental techniques have not been employed, these requirements may be inaccurate, the previous deliverables may be flawed, or the business needs may have changed since the start of the project. Fitness for business purpose, derived from the Dynamic Systems Development Method (DSDM) framework, refers to the focus of delivering necessary functionality within a required timebox. The solution can be more rigorously engineered later, if such an approach is acceptable. Our collective experience shows that applying fit-for-purpose criteria, rather than tight adherence to requirements specifications, results in an information system that more closely meets the needs of the business. In OUM, this principle is extended to refer to the execution of the method processes themselves. Project managers and practitioners are encouraged to scale OUM to be fit-for-purpose for a given situation. It is rarely appropriate to execute every activity within OUM. OUM provides guidance for determining the core set of activities to be executed, the level of detail targeted in those activities and their associated tasks, and the frequency and type of end user deliverables. The project workplan should be developed from this core. The plan should then be scaled up, rather than tailored down, to the level of discipline appropriate to the identified risks and requirements. Even at the task level, models and work products should be completed only to the level of detail required for them to be fit-for-purpose within the current iteration or, at the project level, to suit the business needs of the enterprise and to meet the contractual obligations that govern the project. OUM provides well defined templates for many of its tasks. Use of these templates is optional as determined by the context of the project. Work products can easily be a model in a repository, a prototype, a checklist, a set of application code, or, in situations where a high degree of agility is warranted, simply the tacit knowledge contained in the brain of an analyst or practitioner. For further reading on agility, see Balancing Agility and Discipline: A guide fro the Perplexed.

    Read the article

  • How should I host our scalable worker processes?

    - by Pieter Breed
    We are designing a new architecture for an enterprise business. The principles we've followed so far is not to develop what you can (possible buy and) deploy, ie, don't reinvent any wheels. In this way we've decided on CQRS, RabbitMQ, Riak and a bunch of other things. We still need to write /some/ business code though and these will be in the form of worker processes, which will consume commands from a message queue and after any side-effects, produce events onto another message queue. The idea behind this is that via the competing-consumers design we will have a scalable design right out of the box. One option is of writing a management infrastructure that will know how to: deploy code instantiate processes kill processes update configuration etc IE provide fault tolerance and scalability. Also, this is exactly what something like GAE and Heroku does for you, but in a public setting and in our organization, public is bad. My question is, is there an out-of-the-box solution that we can use to host our consumers in? Like a private cloud or private platform-as-a-service. Private Heroku or GAE. Is there some kind of software or software product with which we can do all of these things and thereby get scalability and fault tolerance over our consumers?

    Read the article

  • PTLQueue : a scalable bounded-capacity MPMC queue

    - by Dave
    Title: Fast concurrent MPMC queue -- I've used the following concurrent queue algorithm enough that it warrants a blog entry. I'll sketch out the design of a fast and scalable multiple-producer multiple-consumer (MPSC) concurrent queue called PTLQueue. The queue has bounded capacity and is implemented via a circular array. Bounded capacity can be a useful property if there's a mismatch between producer rates and consumer rates where an unbounded queue might otherwise result in excessive memory consumption by virtue of the container nodes that -- in some queue implementations -- are used to hold values. A bounded-capacity queue can provide flow control between components. Beware, however, that bounded collections can also result in resource deadlock if abused. The put() and take() operators are partial and wait for the collection to become non-full or non-empty, respectively. Put() and take() do not allocate memory, and are not vulnerable to the ABA pathologies. The PTLQueue algorithm can be implemented equally well in C/C++ and Java. Partial operators are often more convenient than total methods. In many use cases if the preconditions aren't met, there's nothing else useful the thread can do, so it may as well wait via a partial method. An exception is in the case of work-stealing queues where a thief might scan a set of queues from which it could potentially steal. Total methods return ASAP with a success-failure indication. (It's tempting to describe a queue or API as blocking or non-blocking instead of partial or total, but non-blocking is already an overloaded concurrency term. Perhaps waiting/non-waiting or patient/impatient might be better terms). It's also trivial to construct partial operators by busy-waiting via total operators, but such constructs may be less efficient than an operator explicitly and intentionally designed to wait. A PTLQueue instance contains an array of slots, where each slot has volatile Turn and MailBox fields. The array has power-of-two length allowing mod/div operations to be replaced by masking. We assume sensible padding and alignment to reduce the impact of false sharing. (On x86 I recommend 128-byte alignment and padding because of the adjacent-sector prefetch facility). Each queue also has PutCursor and TakeCursor cursor variables, each of which should be sequestered as the sole occupant of a cache line or sector. You can opt to use 64-bit integers if concerned about wrap-around aliasing in the cursor variables. Put(null) is considered illegal, but the caller or implementation can easily check for and convert null to a distinguished non-null proxy value if null happens to be a value you'd like to pass. Take() will accordingly convert the proxy value back to null. An advantage of PTLQueue is that you can use atomic fetch-and-increment for the partial methods. We initialize each slot at index I with (Turn=I, MailBox=null). Both cursors are initially 0. All shared variables are considered "volatile" and atomics such as CAS and AtomicFetchAndIncrement are presumed to have bidirectional fence semantics. Finally T is the templated type. I've sketched out a total tryTake() method below that allows the caller to poll the queue. tryPut() has an analogous construction. Zebra stripping : alternating row colors for nice-looking code listings. See also google code "prettify" : https://code.google.com/p/google-code-prettify/ Prettify is a javascript module that yields the HTML/CSS/JS equivalent of pretty-print. -- pre:nth-child(odd) { background-color:#ff0000; } pre:nth-child(even) { background-color:#0000ff; } border-left: 11px solid #ccc; margin: 1.7em 0 1.7em 0.3em; background-color:#BFB; font-size:12px; line-height:65%; " // PTLQueue : Put(v) : // producer : partial method - waits as necessary assert v != null assert Mask = 1 && (Mask & (Mask+1)) == 0 // Document invariants // doorway step // Obtain a sequence number -- ticket // As a practical concern the ticket value is temporally unique // The ticket also identifies and selects a slot auto tkt = AtomicFetchIncrement (&PutCursor, 1) slot * s = &Slots[tkt & Mask] // waiting phase : // wait for slot's generation to match the tkt value assigned to this put() invocation. // The "generation" is implicitly encoded as the upper bits in the cursor // above those used to specify the index : tkt div (Mask+1) // The generation serves as an epoch number to identify a cohort of threads // accessing disjoint slots while s-Turn != tkt : Pause assert s-MailBox == null s-MailBox = v // deposit and pass message Take() : // consumer : partial method - waits as necessary auto tkt = AtomicFetchIncrement (&TakeCursor,1) slot * s = &Slots[tkt & Mask] // 2-stage waiting : // First wait for turn for our generation // Acquire exclusive "take" access to slot's MailBox field // Then wait for the slot to become occupied while s-Turn != tkt : Pause // Concurrency in this section of code is now reduced to just 1 producer thread // vs 1 consumer thread. // For a given queue and slot, there will be most one Take() operation running // in this section. // Consumer waits for producer to arrive and make slot non-empty // Extract message; clear mailbox; advance Turn indicator // We have an obvious happens-before relation : // Put(m) happens-before corresponding Take() that returns that same "m" for T v = s-MailBox if v != null : s-MailBox = null ST-ST barrier s-Turn = tkt + Mask + 1 // unlock slot to admit next producer and consumer return v Pause tryTake() : // total method - returns ASAP with failure indication for auto tkt = TakeCursor slot * s = &Slots[tkt & Mask] if s-Turn != tkt : return null T v = s-MailBox // presumptive return value if v == null : return null // ratify tkt and v values and commit by advancing cursor if CAS (&TakeCursor, tkt, tkt+1) != tkt : continue s-MailBox = null ST-ST barrier s-Turn = tkt + Mask + 1 return v The basic idea derives from the Partitioned Ticket Lock "PTL" (US20120240126-A1) and the MultiLane Concurrent Bag (US8689237). The latter is essentially a circular ring-buffer where the elements themselves are queues or concurrent collections. You can think of the PTLQueue as a partitioned ticket lock "PTL" augmented to pass values from lock to unlock via the slots. Alternatively, you could conceptualize of PTLQueue as a degenerate MultiLane bag where each slot or "lane" consists of a simple single-word MailBox instead of a general queue. Each lane in PTLQueue also has a private Turn field which acts like the Turn (Grant) variables found in PTL. Turn enforces strict FIFO ordering and restricts concurrency on the slot mailbox field to at most one simultaneous put() and take() operation. PTL uses a single "ticket" variable and per-slot Turn (grant) fields while MultiLane has distinct PutCursor and TakeCursor cursors and abstract per-slot sub-queues. Both PTL and MultiLane advance their cursor and ticket variables with atomic fetch-and-increment. PTLQueue borrows from both PTL and MultiLane and has distinct put and take cursors and per-slot Turn fields. Instead of a per-slot queues, PTLQueue uses a simple single-word MailBox field. PutCursor and TakeCursor act like a pair of ticket locks, conferring "put" and "take" access to a given slot. PutCursor, for instance, assigns an incoming put() request to a slot and serves as a PTL "Ticket" to acquire "put" permission to that slot's MailBox field. To better explain the operation of PTLQueue we deconstruct the operation of put() and take() as follows. Put() first increments PutCursor obtaining a new unique ticket. That ticket value also identifies a slot. Put() next waits for that slot's Turn field to match that ticket value. This is tantamount to using a PTL to acquire "put" permission on the slot's MailBox field. Finally, having obtained exclusive "put" permission on the slot, put() stores the message value into the slot's MailBox. Take() similarly advances TakeCursor, identifying a slot, and then acquires and secures "take" permission on a slot by waiting for Turn. Take() then waits for the slot's MailBox to become non-empty, extracts the message, and clears MailBox. Finally, take() advances the slot's Turn field, which releases both "put" and "take" access to the slot's MailBox. Note the asymmetry : put() acquires "put" access to the slot, but take() releases that lock. At any given time, for a given slot in a PTLQueue, at most one thread has "put" access and at most one thread has "take" access. This restricts concurrency from general MPMC to 1-vs-1. We have 2 ticket locks -- one for put() and one for take() -- each with its own "ticket" variable in the form of the corresponding cursor, but they share a single "Grant" egress variable in the form of the slot's Turn variable. Advancing the PutCursor, for instance, serves two purposes. First, we obtain a unique ticket which identifies a slot. Second, incrementing the cursor is the doorway protocol step to acquire the per-slot mutual exclusion "put" lock. The cursors and operations to increment those cursors serve double-duty : slot-selection and ticket assignment for locking the slot's MailBox field. At any given time a slot MailBox field can be in one of the following states: empty with no pending operations -- neutral state; empty with one or more waiting take() operations pending -- deficit; occupied with no pending operations; occupied with one or more waiting put() operations -- surplus; empty with a pending put() or pending put() and take() operations -- transitional; or occupied with a pending take() or pending put() and take() operations -- transitional. The partial put() and take() operators can be implemented with an atomic fetch-and-increment operation, which may confer a performance advantage over a CAS-based loop. In addition we have independent PutCursor and TakeCursor cursors. Critically, a put() operation modifies PutCursor but does not access the TakeCursor and a take() operation modifies the TakeCursor cursor but does not access the PutCursor. This acts to reduce coherence traffic relative to some other queue designs. It's worth noting that slow threads or obstruction in one slot (or "lane") does not impede or obstruct operations in other slots -- this gives us some degree of obstruction isolation. PTLQueue is not lock-free, however. The implementation above is expressed with polite busy-waiting (Pause) but it's trivial to implement per-slot parking and unparking to deschedule waiting threads. It's also easy to convert the queue to a more general deque by replacing the PutCursor and TakeCursor cursors with Left/Front and Right/Back cursors that can move either direction. Specifically, to push and pop from the "left" side of the deque we would decrement and increment the Left cursor, respectively, and to push and pop from the "right" side of the deque we would increment and decrement the Right cursor, respectively. We used a variation of PTLQueue for message passing in our recent OPODIS 2013 paper. ul { list-style:none; padding-left:0; padding:0; margin:0; margin-left:0; } ul#myTagID { padding: 0px; margin: 0px; list-style:none; margin-left:0;} -- -- There's quite a bit of related literature in this area. I'll call out a few relevant references: Wilson's NYU Courant Institute UltraComputer dissertation from 1988 is classic and the canonical starting point : Operating System Data Structures for Shared-Memory MIMD Machines with Fetch-and-Add. Regarding provenance and priority, I think PTLQueue or queues effectively equivalent to PTLQueue have been independently rediscovered a number of times. See CB-Queue and BNPBV, below, for instance. But Wilson's dissertation anticipates the basic idea and seems to predate all the others. Gottlieb et al : Basic Techniques for the Efficient Coordination of Very Large Numbers of Cooperating Sequential Processors Orozco et al : CB-Queue in Toward high-throughput algorithms on many-core architectures which appeared in TACO 2012. Meneghin et al : BNPVB family in Performance evaluation of inter-thread communication mechanisms on multicore/multithreaded architecture Dmitry Vyukov : bounded MPMC queue (highly recommended) Alex Otenko : US8607249 (highly related). John Mellor-Crummey : Concurrent queues: Practical fetch-and-phi algorithms. Technical Report 229, Department of Computer Science, University of Rochester Thomasson : FIFO Distributed Bakery Algorithm (very similar to PTLQueue). Scott and Scherer : Dual Data Structures I'll propose an optimization left as an exercise for the reader. Say we wanted to reduce memory usage by eliminating inter-slot padding. Such padding is usually "dark" memory and otherwise unused and wasted. But eliminating the padding leaves us at risk of increased false sharing. Furthermore lets say it was usually the case that the PutCursor and TakeCursor were numerically close to each other. (That's true in some use cases). We might still reduce false sharing by incrementing the cursors by some value other than 1 that is not trivially small and is coprime with the number of slots. Alternatively, we might increment the cursor by one and mask as usual, resulting in a logical index. We then use that logical index value to index into a permutation table, yielding an effective index for use in the slot array. The permutation table would be constructed so that nearby logical indices would map to more distant effective indices. (Open question: what should that permutation look like? Possibly some perversion of a Gray code or De Bruijn sequence might be suitable). As an aside, say we need to busy-wait for some condition as follows : "while C == 0 : Pause". Lets say that C is usually non-zero, so we typically don't wait. But when C happens to be 0 we'll have to spin for some period, possibly brief. We can arrange for the code to be more machine-friendly with respect to the branch predictors by transforming the loop into : "if C == 0 : for { Pause; if C != 0 : break; }". Critically, we want to restructure the loop so there's one branch that controls entry and another that controls loop exit. A concern is that your compiler or JIT might be clever enough to transform this back to "while C == 0 : Pause". You can sometimes avoid this by inserting a call to a some type of very cheap "opaque" method that the compiler can't elide or reorder. On Solaris, for instance, you could use :"if C == 0 : { gethrtime(); for { Pause; if C != 0 : break; }}". It's worth noting the obvious duality between locks and queues. If you have strict FIFO lock implementation with local spinning and succession by direct handoff such as MCS or CLH,then you can usually transform that lock into a queue. Hidden commentary and annotations - invisible : * And of course there's a well-known duality between queues and locks, but I'll leave that topic for another blog post. * Compare and contrast : PTLQ vs PTL and MultiLane * Equivalent : Turn; seq; sequence; pos; position; ticket * Put = Lock; Deposit Take = identify and reserve slot; wait; extract & clear; unlock * conceptualize : Distinct PutLock and TakeLock implemented as ticket lock or PTL Distinct arrival cursors but share per-slot "Turn" variable provides exclusive role-based access to slot's mailbox field put() acquires exclusive access to a slot for purposes of "deposit" assigns slot round-robin and then acquires deposit access rights/perms to that slot take() acquires exclusive access to slot for purposes of "withdrawal" assigns slot round-robin and then acquires withdrawal access rights/perms to that slot At any given time, only one thread can have withdrawal access to a slot at any given time, only one thread can have deposit access to a slot Permissible for T1 to have deposit access and T2 to simultaneously have withdrawal access * round-robin for the purposes of; role-based; access mode; access role mailslot; mailbox; allocate/assign/identify slot rights; permission; license; access permission; * PTL/Ticket hybrid Asymmetric usage ; owner oblivious lock-unlock pairing K-exclusion add Grant cursor pass message m from lock to unlock via Slots[] array Cursor performs 2 functions : + PTL ticket + Assigns request to slot in round-robin fashion Deconstruct protocol : explication put() : allocate slot in round-robin fashion acquire PTL for "put" access store message into slot associated with PTL index take() : Acquire PTL for "take" access // doorway step seq = fetchAdd (&Grant, 1) s = &Slots[seq & Mask] // waiting phase while s-Turn != seq : pause Extract : wait for s-mailbox to be full v = s-mailbox s-mailbox = null Release PTL for both "put" and "take" access s-Turn = seq + Mask + 1 * Slot round-robin assignment and lock "doorway" protocol leverage the same cursor and FetchAdd operation on that cursor FetchAdd (&Cursor,1) + round-robin slot assignment and dispersal + PTL/ticket lock "doorway" step waiting phase is via "Turn" field in slot * PTLQueue uses 2 cursors -- put and take. Acquire "put" access to slot via PTL-like lock Acquire "take" access to slot via PTL-like lock 2 locks : put and take -- at most one thread can access slot's mailbox Both locks use same "turn" field Like multilane : 2 cursors : put and take slot is simple 1-capacity mailbox instead of queue Borrow per-slot turn/grant from PTL Provides strict FIFO Lock slot : put-vs-put take-vs-take at most one put accesses slot at any one time at most one put accesses take at any one time reduction to 1-vs-1 instead of N-vs-M concurrency Per slot locks for put/take Release put/take by advancing turn * is instrumental in ... * P-V Semaphore vs lock vs K-exclusion * See also : FastQueues-excerpt.java dice-etc/queue-mpmc-bounded-blocking-circular-xadd/ * PTLQueue is the same as PTLQB - identical * Expedient return; ASAP; prompt; immediately * Lamport's Bakery algorithm : doorway step then waiting phase Threads arriving at doorway obtain a unique ticket number Threads enter in ticket order * In the terminology of Reed and Kanodia a ticket lock corresponds to the busy-wait implementation of a semaphore using an eventcount and a sequencer It can also be thought of as an optimization of Lamport's bakery lock was designed for fault-tolerance rather than performance Instead of spinning on the release counter, processors using a bakery lock repeatedly examine the tickets of their peers --

    Read the article

  • SQL SERVER – Weekend Project – Experimenting with ACID Transactions, SQL Compliant, Elastically Scalable Database

    - by pinaldave
    Database technology is huge and big world. I like to explore always beyond what I know and share the learning. Weekend is the best time when I sit around download random software on my machine which I like to call as a lab machine (it is a pretty old laptop, hardly a quality as lab machine) and experiment it. There are so many free betas available for download that it’s hard to keep track and even harder to find the time to play with very many of them.  This blog is about one you shouldn’t miss if you are interested in the learning various relational databases. NuoDB just released their Beta 7.  I had already downloaded their Beta 6 and yesterday did the same for 7.   My impression is that they are onto something very very interesting.  In fact, it might be something really promising in terms of database elasticity, scale and operational cost reduction. The folks at NuoDB say they are working on the world’s first “emergent” database which they tout as a brand new transitional database that is intended to dramatically change what’s possible with OLTP.  It is SQL compliant, guarantees ACID transactions, yet scales elastically on heterogeneous and decentralized cloud-based resources. Interesting note for sure, making me explore more. Based on what I’ve seen so far, they are solving the architectural challenge that exists between elastic, cloud-based compute infrastructures designed to scale out in response to workload requirements versus the traditional relational database management system’s architecture of central control. Here’s my experience with the NuoDB Beta 6 so far: First they pretty much threw away all the features you’d associate with existing RDBMS architectures except the SQL and ACID transactions which they were smart to keep.  It looks like they have incorporated a number of the big ideas from various algorithms, systems and techniques to achieve maximum DB scalability. From a user’s perspective, the NuoDB Beta software behaves like any other traditional SQL database and seems to offer all the benefits users have come to expect from standards-based SQL solutions. One of the interesting feature is that one can run a transactional node and a storage node on my Windows laptop as well on other platforms – indeed interesting for sure. It’s quite amazing to see a database elastically scale across machine boundaries. So, one of the basic NuoDB concepts is that as you need to scale out, you can easily use more inexpensive hardware when/where you need it.  This is unlike what we have traditionally done to scale a database for an application – we replace the hardware with something more powerful (faster CPU and Disks). This is where I started to feel like NuoDB is on to something that has the potential to elastically scale on commodity hardware while reducing operational expense for a big OLTP database to a degree we’ve never seen before. NuoDB is able to fully leverage the cloud in an asynchronous and highly decentralized manner – while providing both SQL compliance and ACID transactions. Basically what NuoDB is doing is so new that it is all hard to believe until you’ve experienced it in action.  I will keep you up to date as I test the NuoDB Beta 7 but if you are developing a web-scale application or have an on-premise app you are thinking of moving to the cloud, testing this beta is worth your time. If you do try it, let me know what you think.  Before I say anything more, I am going to do more experiments and more test on this product and compare it with other existing similar products. For me it was a weekend worth spent on learning something new. I encourage you to download Beta 7 version and share your opinions here. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Documentation, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Running TeamCity from Amazon EC2 - Cloud based scalable build and continuous Integration

    Ive been having fun playing with the amazon EC2 cloud service. I set up a server running TeamCity, and an image of a server that just runs a TeamCity agent. I also setup TeamCity  to automatically instantiate agents on EC2 and shut them down based upon availability of free agents. Heres how I did it: The first step was setting up the teamcity server. Create an account on amazon EC2 (BTW, amazons sites works better in IE than it does in chrome.. who knew!?) Open the EC2 dashboard, and...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Running TeamCity from Amazon EC2 - Cloud based scalable build and continuous Integration

    Ive been having fun playing with the amazon EC2 cloud service. I set up a server running TeamCity, and an image of a server that just runs a TeamCity agent. I also setup TeamCity  to automatically instantiate agents on EC2 and shut them down based upon availability of free agents. Heres how I did it: The first step was setting up the teamcity server. Create an account on amazon EC2 (BTW, amazons sites works better in IE than it does in chrome.. who knew!?) Open the EC2 dashboard, and...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Running TeamCity from Amazon EC2 - Cloud based scalable build and continuous Integration

    - by RoyOsherove
    I’ve been having fun playing with the amazon EC2 cloud service. I set up a server running TeamCity, and an image of a server that just runs a TeamCity agent. I also setup TeamCity  to automatically instantiate agents on EC2 and shut them down based upon availability of free agents. Here’s how I did it: The first step was setting up the teamcity server. Create an account on amazon EC2 (BTW, amazon’s sites works better in IE than it does in chrome.. who knew!?) Open the EC2 dashboard, and click “Launch Instance” . From the “Quick Start” tab I selected from the list: “Getting Started on Microsoft Windows Server 2008 (AMI Id: ami-c5e40dac)” .  it’s good enough to just run teamcity. In the instance details, I used the default (Small instance, 1.7 GB mem). You might want to choose a close availability zone based on where you are. We want to “Launch instances” so click continue. Select the default kernel, RAM disk and all. No need to enable monitoring for now (you can do that later). click continue. If you don’t have a key pair, you will be prompted to create one. Once you do, select it in the list. Now you’ll be prompted to create a security group. I named mine “TC” as in “TeamCity”. each group is a bunch of settings on which ports can be let through into and out of a hosted machine.  keep it as the default settings. We will change them later. Click continue,  review and then click “Launch”. Now you’ll be able to see the new instance in the running instances list on your site. Now, you need to install stuff on that instance (TeamCity!) . To do that, you’ll need to Remote desktop into that instance. To do that, we’ll get the admin password for that instance: Check it on the list, and click “Instance Actions” - “Get Windows Admin Password”. You might have to wait about 10 minutes or so for the password to be generated for you. Once you have the password, you will remote desktop (start-run-‘mstsc’) into the instance. It’s address is a dns address shown below the list under “Public DNS”. it looks something like: ec2-256-226-194-91.compute-1.amazonaws.com Once you’re inside the instance – you’ll need to open IE (it is in hardened mode so you’ll have to relax its security settings to download stuff). I first downloaded chrome and using chrome I downloaded TeamCity. Note that the download speed is FAST. several MBs per second. To be able to see TeamCity from the outside, you will need to open the advanced firewall settings inside the remote machine, and add incoming and outgoing rules for port 80 (HTTP). Once you do that, you should be able to see the machine from the outside. If you still can’t, see the next step. I also enabled ports 9090 since I will use this machine to create an agent image later as well. Now configure the security group (TC) to enable talking to agents: IN the EC2 dashboard click on “Security Groups” and select your group. To add a rule, click on the empty list under the ‘protocol’ header. select TCP. from and ‘to’ ports are 9090. source ip is 0.0.0.0/0 (every ip is allowed). click “Save.  Also make sure you can see “HTTP” tcp 80 in that list. if you can’t see it, add it or you won’t be able to browse to the machine’s teamcity server home page. I also set an elastic IP for the machine: so I always have the same IP for the machine instance. Allocate and set one through the”Elastic IP” link on the EC2 dashboard.   you should now have a working instance of teamcity.   Now let’s create an agent image. Repeat steps 1-9, but this time, make sure you select a machine that fits what an agent might do. I selected Instance type – Hihg-CPU medium machine,  that is much faster. On that machine, I installed what I needed (VS 2010, PostSharp etc..). downloading VS 2010 from MSDN (2 GB took less than 10 min!) Now, instead of installing teamcity, browse using the browser to the teamcity homepage (from within the remote machine). go to the Administration page, and click the upper right link “Install agents”. Install the agent on he local machine – set it to the IP or DNS of the running TeamCity server. That way you’ll be able to check their connectivity live before making this machine your official agent image to reuse. Once the agent is installed, see that the TC server can see it and use it. see steps 13-14 above if they can’t. Once it works, you can take steps to make this image your agent image to be reused. next, here is a copy-paste of several steps to take from http://confluence.jetbrains.net/display/TCD5/Setting+Up+TeamCity+for+Amazon+EC2 Configure system so that agent it is started on machine boot (and make sure TeamCity server is accessible on machine boot). Test the setup by rebooting machine and checking that the agent connects normally to the server. Prepare the Image for bundling: Remove any temporary/history information in the system. Stop the agent (under Windows stop the service but leave it in Automatic startup type) Delete content agent logs and temp directories (not necessary) Delete "<Agent Home>/conf/amazon-*" file (not necessary) Change config/buildAgent.properties to remove properties: name, serverAddress, authToken (not necessary)   Now, we need to: Make AMI from the running instance. Configure TeamCity EC2 support on TeamCity server. Making an AMI: Check the instance of the agent in the EC2 dashboard instance list, and select instance actions->Create Image (EBS AMI) you’ll see the image pending in the APIs list in the EC2 dashboard. this could take 30 minutes or more. meanwhile we can configure the could support in the teamcity server. COPY THE AMI ID to the clipboard (looks like ami-a88aa4ce) Configuring TeamCity for Cloud: In TeamCity, click on “Agents” and then on “Cloud” tab. this is where you will control your cloud agents. to configure new cloud agents based on APIs, click on the right link to the “configuration page” Create a new profile and select AMazon EC2 as cloud type. Use your AMI ID that you copied to the clipboard into the “Images” field. Select an availability zone that is the same as the one your instance is running on for best communication perf between them make sure you select the ‘TC’ security group hopefully, that should be it, and teamcity will try to instantiate new instances on demand. Note that it may take around 10 minutes for an agent to become available to teamcity from the time it’s started.

    Read the article

  • SQL SERVER – Core Concepts – Elasticity, Scalability and ACID Properties – Exploring NuoDB an Elastically Scalable Database System

    - by pinaldave
    I have been recently exploring Elasticity and Scalability attributes of databases. You can see that in my earlier blog posts about NuoDB where I wanted to look at Elasticity and Scalability concepts. The concepts are very interesting, and intriguing as well. I have discussed these concepts with my friend Joyti M and together we have come up with this interesting read. The goal of this article is to answer following simple questions What is Elasticity? What is Scalability? How ACID properties vary from NOSQL Concepts? What are the prevailing problems in the current database system architectures? Why is NuoDB  an innovative and welcome change in database paradigm? Elasticity This word’s original form is used in many different ways and honestly it does do a decent job in holding things together over the years as a person grows and contracts. Within the tech world, and specifically related to software systems (database, application servers), it has come to mean a few things - allow stretching of resources without reaching the breaking point (on demand). What are resources in this context? Resources are the usual suspects – RAM/CPU/IO/Bandwidth in the form of a container (a process or bunch of processes combined as modules). When it is about increasing resources the simplest idea which comes to mind is the addition of another container. Another container means adding a brand new physical node. When it is about adding a new node there are two questions which comes to mind. 1) Can we add another node to our software system? 2) If yes, does adding new node cause downtime for the system? Let us assume we have added new node, let us see what the new needs of the system are when a new node is added. Balancing incoming requests to multiple nodes Synchronization of a shared state across multiple nodes Identification of “downstate” and resolution action to bring it to “upstate” Well, adding a new node has its advantages as well. Here are few of the positive points Throughput can increase nearly horizontally across the node throughout the system Response times of application will increase as in-between layer interactions will be improved Now, Let us put the above concepts in the perspective of a Database. When we mention the term “running out of resources” or “application is bound to resources” the resources can be CPU, Memory or Bandwidth. The regular approach to “gain scalability” in the database is to look around for bottlenecks and increase the bottlenecked resource. When we have memory as a bottleneck we look at the data buffers, locks, query plans or indexes. After a point even this is not enough as there needs to be an efficient way of managing such large workload on a “single machine” across memory and CPU bound (right kind of scheduling)  workload. We next move on to either read/write separation of the workload or functionality-based sharing so that we still have control of the individual. But this requires lots of planning and change in client systems in terms of knowing where to go/update/read and for reporting applications to “aggregate the data” in an intelligent way. What we ideally need is an intelligent layer which allows us to do these things without us getting into managing, monitoring and distributing the workload. Scalability In the context of database/applications, scalability means three main things Ability to handle normal loads without pressure E.g. X users at the Y utilization of resources (CPU, Memory, Bandwidth) on the Z kind of hardware (4 processor, 32 GB machine with 15000 RPM SATA drives and 1 GHz Network switch) with T throughput Ability to scale up to expected peak load which is greater than normal load with acceptable response times Ability to provide acceptable response times across the system E.g. Response time in S milliseconds (or agreed upon unit of measure) – 90% of the time The Issue – Need of Scale In normal cases one can plan for the load testing to test out normal, peak, and stress scenarios to ensure specific hardware meets the needs. With help from Hardware and Software partners and best practices, bottlenecks can be identified and requisite resources added to the system. Unfortunately this vertical scale is expensive and difficult to achieve and most of the operational people need the ability to scale horizontally. This helps in getting better throughput as there are physical limits in terms of adding resources (Memory, CPU, Bandwidth and Storage) indefinitely. Today we have different options to achieve scalability: Read & Write Separation The idea here is to do actual writes to one store and configure slaves receiving the latest data with acceptable delays. Slaves can be used for balancing out reads. We can also explore functional separation or sharing as well. We can separate data operations by a specific identifier (e.g. region, year, month) and consolidate it for reporting purposes. For functional separation the major disadvantage is when schema changes or workload pattern changes. As the requirement grows one still needs to deal with scale need in manual ways by providing an abstraction in the middle tier code. Using NOSQL solutions The idea is to flatten out the structures in general to keep all values which are retrieved together at the same store and provide flexible schema. The issue with the stores is that they are compromising on mostly consistency (no ACID guarantees) and one has to use NON-SQL dialect to work with the store. The other major issue is about education with NOSQL solutions. Would one really want to make these compromises on the ability to connect and retrieve in simple SQL manner and learn other skill sets? Or for that matter give up on ACID guarantee and start dealing with consistency issues? Hybrid Deployment – Mac, Linux, Cloud, and Windows One of the challenges today that we see across On-premise vs Cloud infrastructure is a difference in abilities. Take for example SQL Azure – it is wonderful in its concepts of throttling (as it is shared deployment) of resources and ability to scale using federation. However, the same abilities are not available on premise. This is not a mistake, mind you – but a compromise of the sweet spot of workloads, customer requirements and operational SLAs which can be supported by the team. In today’s world it is imperative that databases are available across operating systems – which are a commodity and used by developers of all hues. An Ideal Database Ability List A system which allows a linear scale of the system (increase in throughput with reasonable response time) with the addition of resources A system which does not compromise on the ACID guarantees and require developers to learn new paradigms A system which does not force fit a new way interacting with database by learning Non-SQL dialect A system which does not force fit its mechanisms for providing availability across its various modules. Well NuoDB is the first database which has all of the above abilities and much more. In future articles I will cover my hands-on experience with it. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: NuoDB

    Read the article

  • Highly scalable and dynamic "rule-based" applications?

    - by Prof Plum
    For a large enterprise app, everyone knows that being able to adjust to change is one of the most important aspects of design. I use a rule-based approach a lot of the time to deal with changing business logic, with each rule being stored in a DB. This allows for easy changes to be made without diving into nasty details. Now since C# cannot Eval("foo(bar);") this is accomplished by using formatted strings stored in rows that are then processed in JavaScript at runtime. This works fine, however, it is less than elegant, and would not be the most enjoyable for anyone else to pick up on once it becomes legacy. Is there a more elegant solution to this? When you get into thousands of rules that change fairly frequently it becomes a real bear, but this cannot be that uncommon of a problem that someone has not thought of a better way to do this. Any suggestions? Is this current method defensible? What are the alternatives? Edit: Just to clarify, this is a large enterprise app, so no matter which solution works, there will be plenty of people constantly maintaining its rules and data (around 10). Also, The data changes frequently enough to say that some sort of centralized server system is basically a must.

    Read the article

  • Scalable distributed file system for blobs like images and other documents

    - by Pinnacle
    Cassandra & HBase both do not efficiently support storage of blobs like images. Storing directly on HDFS stresses the Namenode because of huge number of files. Facebook uses Haystack for images and attachments storage, but this is not open source. So is Lustre a good choice for distributed blob storage? I have read that Amazon S3 is used by many, but this would cost money and personally, I would not like to rely on third party system. What are other suggestions?

    Read the article

1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >