data dictionary - Page 10

Django dictionary in templates: Grab key from another objects attribute

- by Jordan Messina

I have a dictionary called number_devices I'm passing to a template, the dictionary keys are the ids of a list of objects I'm also passing to the template (called implementations). I'm iterating over the list of objects and then trying to use the object.id to get a value out of the dict like so: {% for implementation in implementations %} {{ number_devices.implementation.id }} {% endfor %} Unfortunately number_devices.implementation is evaluated first, then the result.id is evaluated obviously returning and displaying nothing. I can't use parentheses like: {{ number_devices.(implementation.id) }} because I get a parse error. How do I get around this annoyance in Django templates? Thanks for any help!

Read the article

Find size of dictionary object in VBA

- by Gershwyn

What is the simplest way in VBA to determine the size (i.e., number of keys) in a dictionary object?

Read the article

Fastest way to store/retrieve a dictionary - SQL, text file...?

- by AP257

Hi all, This is a really really super dumb question, so I apologise, but I'd be grateful for some advice. I've got a text file of words and word frequencies. It's very large - theoretically we're talking millions of rows. I just want to retrieve values from the file, and do it as quickly and efficiently as possible (for a web app, in Django). My question is: what is the best way to store and retrieve the values? Should import them into SQL? Or keep the file and use grep? Or put them into a JSON dictionary...? Or some other way? Sorry for the dumb question, would be very grateful for advice!

Read the article

Wherewith replace Dictionary.Where(p => p.Key is T)

- by abatishchev

I have a System.Collections.Generic.Dictionary<System.Web.UI.Control, object> where all keys can be either type of System.Web.UI.WebControls.HyperLink or type of System.Web.UI.WebControls.Label. I want to change Text property of each control. Because HyperLink doesn't implement (why??) ITextControl, I need to cast Label or HyperLink explicitly: Dictionary<Control,object> dic = .. dic .Where(p => p.Key is HyperLink) .ForEach(c => ((HyperLink)c).Text = "something") dic .Where(p => p.Key is Label) .ForEach(c => ((Label)c).Text = "something") Are there ways to workaround such approach?

Read the article

Obtaining Index value of dictionary

- by Maudise

I have a piece of code which looks at the following public Test As Dictionary(Of String, String()) Which is brought in tester = New Dictionary(Of String, String()) tester.add("Key_EN", {"Option 1_EN", "Option 2_EN", "Option 3_EN"}) tester.add("Key_FR", {"Option 1_FR", "Option 2_FR", "Option 3_FR"}) tester.add("Key_DE", {"Option 1_DE", "Option 2_DE", "Option 3_DE"}) There's then a combo box which looks at the following dim Language as string Language = "_EN" ' note this is done by a drop down combo box to select _EN or _FR etc. cboTestBox.items.AddRange(tester("Key" & Language)) What I need to be able to do is to see what index position the answer is in and convert it back to the Key_EN. So, for example _DE is selected, then the options of "Option 1_DE", "Option 2_DE", "Option 3_DE" would be displayed. If they chose Option 3_DE then I need to be able to convert this to Option 3_EN. Many thanks Maudise

Read the article

Django: Converting an entire Model into a single dictionary

- by LarrikJ

Is there a good way in Django to convert an entire model to a dictionary? I mean, like this: class DictModel(models.Model): key = models.CharField(20) value = models.CharField(200) DictModel.objects.all().to_dict() ... with the result being a dictionary with the key/value pairs made up of records in the Model? Has anyone else seen this as being useful for them? Thanks. Update I just wanted to add is that my ultimate goal is to be able to do a simple variable lookup inside a Template. Something like: {{ DictModel.exampleKey }} With a result of DictModel.objects.get(key__exact=exampleKey).value Overall, though, you guys have really surprised me with how helpful allof your responses are, and how different the ways to approach it can be. Thanks a lot.

Read the article

Recover data loss from accidental quick format

- by Bo Tian

I've accidentally performed a quick format on the mobile hard disk thinking that the data is useless. Now my mum says that she needs them back. What are the tools that I could use to recover the data?

Read the article

New Feature in ODI 11.1.1.6: ODI for Big Data

- by Julien Testut

Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} By Ananth Tirupattur Starting with Oracle Data Integrator 11.1.1.6.0, ODI is offering a solution to process Big Data. This post provides an overview of this feature. With all the buzz around Big Data and before getting into the details of ODI for Big Data, I will provide a brief introduction to Big Data and Oracle Solution for Big Data. So, what is Big Data? Big data includes: structured data (this includes data from relation data stores, xml data stores), semi-structured data (this includes data from weblogs) unstructured data (this includes data from text blob, images) Traditionally, business decisions are based on the information gathered from transactional data. For example, transactional Data from CRM applications is fed to a decision system for analysis and decision making. Products such as ODI play a key role in enabling decision systems. However, with the emergence of massive amounts of semi-structured and unstructured data it is important for decision system to include them in the analysis to achieve better decision making capability. While there is an abundance of opportunities for business for gaining competitive advantages, process of Big Data has challenges. The challenges of processing Big Data include: Volume of data Velocity of data - The high Rate at which data is generated Variety of data In order to address these challenges and convert them into opportunities, we would need an appropriate framework, platform and the right set of tools. Hadoop is an open source framework which is highly scalable, fault tolerant system, for storage and processing large amounts of data. Hadoop provides 2 key services, distributed and reliable storage called Hadoop Distributed File System or HDFS and a framework for parallel data processing called Map-Reduce. Innovations in Hadoop and its related technology continue to rapidly evolve, hence therefore, it is highly recommended to follow information on the web to keep up with latest information. Oracle's vision is to provide a comprehensive solution to address the challenges faced by Big Data. Oracle is providing the necessary Hardware, software and tools for processing Big Data Oracle solution includes: Big Data Appliance Oracle NoSQL Database Cloudera distribution for Hadoop Oracle R Enterprise- R is a statistical package which is very popular among data scientists. ODI solution for Big Data Oracle Loader for Hadoop for loading data from Hadoop to Oracle. Further details can be found here: http://www.oracle.com/us/products/database/big-data-appliance/overview/index.html ODI Solution for Big Data: ODI’s goal is to minimize the need to understand the complexity of Hadoop framework and simplify the adoption of processing Big Data seamlessly in an enterprise. ODI is providing the capabilities for an integrated architecture for processing Big Data. This includes capability to load data in to Hadoop, process data in Hadoop and load data from Hadoop into Oracle. ODI is expanding its support for Big Data by providing the following out of the box Knowledge Modules (KMs). IKM File to Hive (LOAD DATA).Load unstructured data from File (Local file system or HDFS ) into Hive IKM Hive Control AppendTransform and validate structured data on Hive IKM Hive TransformTransform unstructured data on Hive IKM File/Hive to Oracle (OLH)Load processed data in Hive to Oracle RKM HiveReverse engineer Hive tables to generate models Using the Loading KM you can map files (local and HDFS files) to the corresponding Hive tables. For example, you can map weblog files categorized by date into a corresponding partitioned Hive table schema. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Hive control Append KM you can validate and transform data in Hive. In the below example, two source Hive tables are joined and mapped to a target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} The Hive Transform KM facilitates processing of semi-structured data in Hive. In the below example, the data from weblog is processed using a Perl script and mapped to target Hive table. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Using the Oracle Loader for Hadoop (OLH) KM you can load data from Hive table or HDFS to a corresponding table in Oracle. OLH is available as a standalone product. ODI greatly enhances OLH capability by generating the configuration and mapping files for OLH based on the configuration provided in the interface and KM options. ODI seamlessly invokes OLH when executing the scenario. In the below example, a HDFS file is mapped to a table in Oracle. Development and Deployment:The following diagram illustrates the development and deployment of ODI solution for Big Data. Using the ODI Studio on your development machine create and develop ODI solution for processing Big Data by connecting to a MySQL DB or Oracle database on a BDA machine or Hadoop cluster. Schedule the ODI scenarios to be executed on the ODI agent deployed on the BDA machine or Hadoop cluster. ODI Solution for Big Data provides several exciting new capabilities to facilitate the adoption of Big Data in an enterprise. You can find more information about the Oracle Big Data connectors on OTN. You can find an overview of all the new features introduced in ODI 11.1.1.6 in the following document: ODI 11.1.1.6 New Features Overview

Read the article

What is a good way to keep track of strings for dictionary lookups?

- by Justin

I am working through the Windows 8 app tutorial. They have some code about saving app data like so: private void NameInput_TextChanged(object sender, TextChangedEventArgs e) { Windows.Storage.ApplicationDataContainer roamingSettings = Windows.Storage.ApplicationData.Current.RoamingSettings; roamingSettings.Values["userName"] = nameInput.Text; } I have worked with C# in the past and found that things like using constant string values (like "userName" in this case) for keys could get messy because auto-complete did not work and it was easy to forget if I had made an entry for a setting before and what it was called. So if I don't touch code for awhile I end up accidentally creating multiple entries for the same value that are named slightly differently. Surely there is a better way to keep track of the strings that key to those values. What is a good solution to this problem?

Read the article

Data recovery on a data HDD (no OS)

- by aCuria

I am helping a family member with a dead hard disk. It is a seagate 200Gb 3.5" HDD in one of those old-school external enclosures. The problem was that windows failed to detect the hard disk when plugged in through USB. I removed the hard disk from its enclosure, and plugged it into my desktop PC. The BIOS does detect it upon POST, but unfortunately windows 7 would refuse to boot. It will get stuck on the loading screen with the glowing windows logo. Safe mode doesn't help either. What options do I have before going for some professional data recovery? edit: Someone modified the Title to something completely different from what I was asking, i just changed it back. 1) 2 HDD drives, DiskA(Dead), DiskB(my OS disk) 2) when B is connected to my system, everything works fine 3) when A AND B is connected, failure to boot. POSTs fine, but windows wont load 4) A has NO OS, its PURE data. It came from an EXTERNAL HDD enclosure which doesnt belong to me, and im trying to do data recovery.

Read the article

Python dictionary key missing

- by Greg K

I thought I'd put together a quick script to consolidate the CSS rules I have distributed across multiple CSS files, then I can minify it. I'm new to Python but figured this would be a good exercise to try a new language. My main loop isn't parsing the CSS as I thought it would. I populate a list with selectors parsed from the CSS files to return the CSS rules in order. Later in the script, the list contains an element that is not found in the dictionary. for line in self.file.readlines(): if self.hasSelector(line): selector = self.getSelector(line) if selector not in self.order: self.order.append(selector) elif selector and self.hasProperty(line): # rules.setdefault(selector,[]).append(self.getProperty(line)) property = self.getProperty(line) properties = [] if selector not in rules else rules[selector] if property not in properties: properties.append(property) rules[selector] = properties # print "%s :: %s" % (selector, "".join(rules[selector])) return rules Error encountered: $ css-combine combined.css test1.css test2.css Traceback (most recent call last): File "css-combine", line 108, in <module> c.run(outfile, stylesheets) File "css-combine", line 64, in run [(selector, rules[selector]) for selector in parser.order], KeyError: 'p' Swap the inputs: $ css-combine combined.css test2.css test1.css Traceback (most recent call last): File "css-combine", line 108, in <module> c.run(outfile, stylesheets) File "css-combine", line 64, in run [(selector, rules[selector]) for selector in parser.order], KeyError: '#header_.title' I've done some quirky things in the code like sub spaces for underscores in dictionary key names in case it was an issue - maybe this is a benign precaution? Depending on the order of the inputs, a different key cannot be found in the dictionary. The script: #!/usr/bin/env python import optparse import re class CssParser: def __init__(self): self.file = False self.order = [] # store rules assignment order def parse(self, rules = {}): if self.file == False: raise IOError("No file to parse") selector = False for line in self.file.readlines(): if self.hasSelector(line): selector = self.getSelector(line) if selector not in self.order: self.order.append(selector) elif selector and self.hasProperty(line): # rules.setdefault(selector,[]).append(self.getProperty(line)) property = self.getProperty(line) properties = [] if selector not in rules else rules[selector] if property not in properties: properties.append(property) rules[selector] = properties # print "%s :: %s" % (selector, "".join(rules[selector])) return rules def hasSelector(self, line): return True if re.search("^([#a-z,\.:\s]+){", line) else False def getSelector(self, line): s = re.search("^([#a-z,:\.\s]+){", line).group(1) return "_".join(s.strip().split()) def hasProperty(self, line): return True if re.search("^\s?[a-z-]+:[^;]+;", line) else False def getProperty(self, line): return re.search("([a-z-]+:[^;]+;)", line).group(1) class Consolidator: """Class to consolidate CSS rule attributes""" def run(self, outfile, files): parser = CssParser() rules = {} for file in files: try: parser.file = open(file) rules = parser.parse(rules) except IOError: print "Cannot read file: " + file finally: parser.file.close() self.serialize( [(selector, rules[selector]) for selector in parser.order], outfile ) def serialize(self, rules, outfile): try: f = open(outfile, "w") for rule in rules: f.write( "%s {\n\t%s\n}\n\n" % ( " ".join(rule[0].split("_")), "\n\t".join(rule[1]) ) ) except IOError: print "Cannot write output to: " + outfile finally: f.close() def init(): op = optparse.OptionParser( usage="Usage: %prog [options] <output file> <stylesheet1> " + "<stylesheet2> ... <stylesheetN>", description="Combine CSS rules spread across multiple " + "stylesheets into a single file" ) opts, args = op.parse_args() if len(args) < 3: if len(args) == 1: print "Error: No input files specified.\n" elif len(args) == 2: print "Error: One input file specified, nothing to combine.\n" op.print_help(); exit(-1) return [opts, args] if __name__ == '__main__': opts, args = init() outfile, stylesheets = [args[0], args[1:]] c = Consolidator() c.run(outfile, stylesheets) Test CSS file 1: body { background-color: #e7e7e7; } p { margin: 1em 0em; } File 2: body { font-size: 16px; } #header .title { font-family: Tahoma, Geneva, sans-serif; font-size: 1.9em; } #header .title a, #header .title a:hover { color: #f5f5f5; border-bottom: none; text-shadow: 2px 2px 3px rgba(0, 0, 0, 1); } Thanks in advance.

Read the article

Data Governance 2010 Conference in San Diego

- by Tony Ouk

The Data Governance Annual Conference is one of the world's most authoritative and vendor neutral event on Data Governance and Data Quality. The conference will focus on the "how-tos" from starting a data governance and stewardship program to attaining data governance maturity with specific topics on MDM. This year's event will be hosted June 7 through June 10 in San Diego, California. For more information, including registration details, visit the Data Governance 2010 Conference website.

Read the article

How to search for newline or linebreak characters in Excel?

- by Highly Irregular

I've imported some data into Excel (from a text file) and it contains some sort of newline characters. It looks like this initially: If I hit F2 (to edit) then Enter (to save changes) on each of the cells with a newline (without actually editing anything), Excel automatically changes the layout to look like this: I don't want these newlines characters here, as it messes up data processing further down the track. How can I do a search for these to detect more of them? The usual search function doesn't accept an enter character as a search character.

Read the article

Oracle Data Integration 12c: Simplified, Future-Ready, High-Performance Solutions

- by Thanos Terentes Printzios

In today’s data-driven business environment, organizations need to cost-effectively manage the ever-growing streams of information originating both inside and outside the firewall and address emerging deployment styles like cloud, big data analytics, and real-time replication. Oracle Data Integration delivers pervasive and continuous access to timely and trusted data across heterogeneous systems. Oracle is enhancing its data integration offering announcing the general availability of 12c release for the key data integration products: Oracle Data Integrator 12c and Oracle GoldenGate 12c, delivering Simplified and High-Performance Solutions for Cloud, Big Data Analytics, and Real-Time Replication. The new release delivers extreme performance, increase IT productivity, and simplify deployment, while helping IT organizations to keep pace with new data-oriented technology trends including cloud computing, big data analytics, real-time business intelligence. With the 12c release Oracle becomes the new leader in the data integration and replication technologies as no other vendor offers such a complete set of data integration capabilities for pervasive, continuous access to trusted data across Oracle platforms as well as third-party systems and applications. Oracle Data Integration 12c release addresses data-driven organizations’ critical and evolving data integration requirements under 3 key themes: Future-Ready Solutions : Supporting Current and Emerging Initiatives Extreme Performance : Even higher performance than ever before Fast Time-to-Value : Higher IT Productivity and Simplified Solutions With the new capabilities in Oracle Data Integrator 12c, customers can benefit from: Superior developer productivity, ease of use, and rapid time-to-market with the new flow-based mapping model, reusable mappings, and step-by-step debugger. Increased performance when executing data integration processes due to improved parallelism. Improved productivity and monitoring via tighter integration with Oracle GoldenGate 12c and Oracle Enterprise Manager 12c. Improved interoperability with Oracle Warehouse Builder which enables faster and easier migration to Oracle Data Integrator’s strategic data integration offering. Faster implementation of business analytics through Oracle Data Integrator pre-integrated with Oracle BI Applications’ latest release. Oracle Data Integrator also integrates simply and easily with Oracle Business Analytics tools, including OBI-EE and Oracle Hyperion. Support for loading and transforming big and fast data, enabled by integration with big data technologies: Hadoop, Hive, HDFS, and Oracle Big Data Appliance. Only Oracle GoldenGate provides the best-of-breed real-time replication of data in heterogeneous data environments. With the new capabilities in Oracle GoldenGate 12c, customers can benefit from: Simplified setup and management of Oracle GoldenGate 12c when using multiple database delivery processes via a new Coordinated Delivery feature for non-Oracle databases. Expanded heterogeneity through added support for the latest versions of major databases such as Sybase ASE v 15.7, MySQL NDB Clusters 7.2, and MySQL 5.6., as well as integration with Oracle Coherence. Enhanced high availability and data protection via integration with Oracle Data Guard and Fast-Start Failover integration. Enhanced security for credentials and encryption keys using Oracle Wallet. Real-time replication for databases hosted on public cloud environments supported by third-party clouds. Tight integration between Oracle Data Integrator 12c and Oracle GoldenGate 12c and other Oracle technologies, such as Oracle Database 12c and Oracle Applications, provides a number of benefits for organizations: Tight integration between Oracle Data Integrator 12c and Oracle GoldenGate 12c enables developers to leverage Oracle GoldenGate’s low overhead, real-time change data capture completely within the Oracle Data Integrator Studio without additional training. Integration with Oracle Database 12c provides a strong foundation for seamless private cloud deployments. Delivers real-time data for reporting, zero downtime migration, and improved performance and availability for Oracle Applications, such as Oracle E-Business Suite and ATG Web Commerce . Oracle’s data integration offering is optimized for Oracle Engineered Systems and is an integral part of Oracle’s fast data, real-time analytics strategy on Oracle Exadata Database Machine and Oracle Exalytics In-Memory Machine. Oracle Data Integrator 12c and Oracle GoldenGate 12c differentiate the new offering on data integration with these many new features. This is just a quick glimpse into Oracle Data Integrator 12c and Oracle GoldenGate 12c. Find out much more about the new release in the video webcast "Introducing 12c for Oracle Data Integration", where customer and partner speakers, including SolarWorld, BT, Rittman Mead will join us in launching the new release. Resource Kits Meet Oracle Data Integration 12c Discover what's new with Oracle Goldengate 12c Oracle EMEA DIS (Data Integration Solutions) Partner Community is available for all your questions, while additional partner focused webcasts will be made available through our blog here, so stay connected. For any questions please contact us at partner.imc-AT-beehiveonline.oracle-DOT-com Stay Connected Oracle Newsletters

Read the article

C#/.NET Little Wonders: The ConcurrentDictionary

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. In this series of posts, we will discuss how the concurrent collections have been developed to help alleviate these multi-threading concerns. Last week’s post began with a general introduction and discussed the ConcurrentStack<T> and ConcurrentQueue<T>. Today's post discusses the ConcurrentDictionary<T> (originally I had intended to discuss ConcurrentBag this week as well, but ConcurrentDictionary had enough information to create a very full post on its own!). Finally next week, we shall close with a discussion of the ConcurrentBag<T> and BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. Recap As you'll recall from the previous post, the original collections were object-based containers that accomplished synchronization through a Synchronized member. While these were convenient because you didn't have to worry about writing your own synchronization logic, they were a bit too finely grained and if you needed to perform multiple operations under one lock, the automatic synchronization didn't buy much. With the advent of .NET 2.0, the original collections were succeeded by the generic collections which are fully type-safe, but eschew automatic synchronization. This cuts both ways in that you have a lot more control as a developer over when and how fine-grained you want to synchronize, but on the other hand if you just want simple synchronization it creates more work. With .NET 4.0, we get the best of both worlds in generic collections. A new breed of collections was born called the concurrent collections in the System.Collections.Concurrent namespace. These amazing collections are fine-tuned to have best overall performance for situations requiring concurrent access. They are not meant to replace the generic collections, but to simply be an alternative to creating your own locking mechanisms. Among those concurrent collections were the ConcurrentStack<T> and ConcurrentQueue<T> which provide classic LIFO and FIFO collections with a concurrent twist. As we saw, some of the traditional methods that required calls to be made in a certain order (like checking for not IsEmpty before calling Pop()) were replaced in favor of an umbrella operation that combined both under one lock (like TryPop()). Now, let's take a look at the next in our series of concurrent collections!For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here. ConcurrentDictionary – the fully thread-safe dictionary The ConcurrentDictionary<TKey,TValue> is the thread-safe counterpart to the generic Dictionary<TKey, TValue> collection. Obviously, both are designed for quick – O(1) – lookups of data based on a key. If you think of algorithms where you need lightning fast lookups of data and don’t care whether the data is maintained in any particular ordering or not, the unsorted dictionaries are generally the best way to go. Note: as a side note, there are sorted implementations of IDictionary, namely SortedDictionary and SortedList which are stored as an ordered tree and a ordered list respectively. While these are not as fast as the non-sorted dictionaries – they are O(log2 n) – they are a great combination of both speed and ordering -- and still greatly outperform a linear search. Now, once again keep in mind that if all you need to do is load a collection once and then allow multi-threaded reading you do not need any locking. Examples of this tend to be situations where you load a lookup or translation table once at program start, then keep it in memory for read-only reference. In such cases locking is completely non-productive. However, most of the time when we need a concurrent dictionary we are interleaving both reads and updates. This is where the ConcurrentDictionary really shines! It achieves its thread-safety with no common lock to improve efficiency. It actually uses a series of locks to provide concurrent updates, and has lockless reads! This means that the ConcurrentDictionary gets even more efficient the higher the ratio of reads-to-writes you have. ConcurrentDictionary and Dictionary differences For the most part, the ConcurrentDictionary<TKey,TValue> behaves like it’s Dictionary<TKey,TValue> counterpart with a few differences. Some notable examples of which are: Add() does not exist in the concurrent dictionary. This means you must use TryAdd(), AddOrUpdate(), or GetOrAdd(). It also means that you can’t use a collection initializer with the concurrent dictionary. TryAdd() replaced Add() to attempt atomic, safe adds. Because Add() only succeeds if the item doesn’t already exist, we need an atomic operation to check if the item exists, and if not add it while still under an atomic lock. TryUpdate() was added to attempt atomic, safe updates. If we want to update an item, we must make sure it exists first and that the original value is what we expected it to be. If all these are true, we can update the item under one atomic step. TryRemove() was added to attempt atomic, safe removes. To safely attempt to remove a value we need to see if the key exists first, this checks for existence and removes under an atomic lock. AddOrUpdate() was added to attempt an thread-safe “upsert”. There are many times where you want to insert into a dictionary if the key doesn’t exist, or update the value if it does. This allows you to make a thread-safe add-or-update. GetOrAdd() was added to attempt an thread-safe query/insert. Sometimes, you want to query for whether an item exists in the cache, and if it doesn’t insert a starting value for it. This allows you to get the value if it exists and insert if not. Count, Keys, Values properties take a snapshot of the dictionary. Accessing these properties may interfere with add and update performance and should be used with caution. ToArray() returns a static snapshot of the dictionary. That is, the dictionary is locked, and then copied to an array as a O(n) operation. GetEnumerator() is thread-safe and efficient, but allows dirty reads. Because reads require no locking, you can safely iterate over the contents of the dictionary. The only downside is that, depending on timing, you may get dirty reads. Dirty reads during iteration The last point on GetEnumerator() bears some explanation. Picture a scenario in which you call GetEnumerator() (or iterate using a foreach, etc.) and then, during that iteration the dictionary gets updated. This may not sound like a big deal, but it can lead to inconsistent results if used incorrectly. The problem is that items you already iterated over that are updated a split second after don’t show the update, but items that you iterate over that were updated a split second before do show the update. Thus you may get a combination of items that are “stale” because you iterated before the update, and “fresh” because they were updated after GetEnumerator() but before the iteration reached them. Let’s illustrate with an example, let’s say you load up a concurrent dictionary like this: 1: // load up a dictionary. 2: var dictionary = new ConcurrentDictionary<string, int>(); 3: 4: dictionary["A"] = 1; 5: dictionary["B"] = 2; 6: dictionary["C"] = 3; 7: dictionary["D"] = 4; 8: dictionary["E"] = 5; 9: dictionary["F"] = 6; Then you have one task (using the wonderful TPL!) to iterate using dirty reads: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); And one task to attempt updates in a separate thread (probably): 1: // attempt updates in a separate thread 2: var updateTask = new Task(() => 3: { 4: // iterates, and updates the value by one 5: foreach (var pair in dictionary) 6: { 7: dictionary[pair.Key] = pair.Value + 1; 8: } 9: }); Now that we’ve done this, we can fire up both tasks and wait for them to complete: 1: // start both tasks 2: updateTask.Start(); 3: iterationTask.Start(); 4: 5: // wait for both to complete. 6: Task.WaitAll(updateTask, iterationTask); Now, if I you didn’t know about the dirty reads, you may have expected to see the iteration before the updates (such as A:1, B:2, C:3, D:4, E:5, F:6). However, because the reads are dirty, we will quite possibly get a combination of some updated, some original. My own run netted this result: 1: F:6 2: E:6 3: D:5 4: C:4 5: B:3 6: A:2 Note that, of course, iteration is not in order because ConcurrentDictionary, like Dictionary, is unordered. Also note that both E and F show the value 6. This is because the output task reached F before the update, but the updates for the rest of the items occurred before their output (probably because console output is very slow, comparatively). If we want to always guarantee that we will get a consistent snapshot to iterate over (that is, at the point we ask for it we see precisely what is in the dictionary and no subsequent updates during iteration), we should iterate over a call to ToArray() instead: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary.ToArray()) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); The atomic Try…() methods As you can imagine TryAdd() and TryRemove() have few surprises. Both first check the existence of the item to determine if it can be added or removed based on whether or not the key currently exists in the dictionary: 1: // try add attempts an add and returns false if it already exists 2: if (dictionary.TryAdd("G", 7)) 3: Console.WriteLine("G did not exist, now inserted with 7"); 4: else 5: Console.WriteLine("G already existed, insert failed."); TryRemove() also has the virtue of returning the value portion of the removed entry matching the given key: 1: // attempt to remove the value, if it exists it is removed and the original is returned 2: int removedValue; 3: if (dictionary.TryRemove("C", out removedValue)) 4: Console.WriteLine("Removed C and its value was " + removedValue); 5: else 6: Console.WriteLine("C did not exist, remove failed."); Now TryUpdate() is an interesting creature. You might think from it’s name that TryUpdate() first checks for an item’s existence, and then updates if the item exists, otherwise it returns false. Well, note quite... It turns out when you call TryUpdate() on a concurrent dictionary, you pass it not only the new value you want it to have, but also the value you expected it to have before the update. If the item exists in the dictionary, and it has the value you expected, it will update it to the new value atomically and return true. If the item is not in the dictionary or does not have the value you expected, it is not modified and false is returned. 1: // attempt to update the value, if it exists and if it has the expected original value 2: if (dictionary.TryUpdate("G", 42, 7)) 3: Console.WriteLine("G existed and was 7, now it's 42."); 4: else 5: Console.WriteLine("G either didn't exist, or wasn't 7."); The composite Add methods The ConcurrentDictionary also has composite add methods that can be used to perform updates and gets, with an add if the item is not existing at the time of the update or get. The first of these, AddOrUpdate(), allows you to add a new item to the dictionary if it doesn’t exist, or update the existing item if it does. For example, let’s say you are creating a dictionary of counts of stock ticker symbols you’ve subscribed to from a market data feed: 1: public sealed class SubscriptionManager 2: { 3: private readonly ConcurrentDictionary<string, int> _subscriptions = new ConcurrentDictionary<string, int>(); 4: 5: // adds a new subscription, or increments the count of the existing one. 6: public void AddSubscription(string tickerKey) 7: { 8: // add a new subscription with count of 1, or update existing count by 1 if exists 9: var resultCount = _subscriptions.AddOrUpdate(tickerKey, 1, (symbol, count) => count + 1); 10: 11: // now check the result to see if we just incremented the count, or inserted first count 12: if (resultCount == 1) 13: { 14: // subscribe to symbol... 15: } 16: } 17: } Notice the update value factory Func delegate. If the key does not exist in the dictionary, the add value is used (in this case 1 representing the first subscription for this symbol), but if the key already exists, it passes the key and current value to the update delegate which computes the new value to be stored in the dictionary. The return result of this operation is the value used (in our case: 1 if added, existing value + 1 if updated). Likewise, the GetOrAdd() allows you to attempt to retrieve a value from the dictionary, and if the value does not currently exist in the dictionary it will insert a value. This can be handy in cases where perhaps you wish to cache data, and thus you would query the cache to see if the item exists, and if it doesn’t you would put the item into the cache for the first time: 1: public sealed class PriceCache 2: { 3: private readonly ConcurrentDictionary<string, double> _cache = new ConcurrentDictionary<string, double>(); 4: 5: // adds a new subscription, or increments the count of the existing one. 6: public double QueryPrice(string tickerKey) 7: { 8: // check for the price in the cache, if it doesn't exist it will call the delegate to create value. 9: return _cache.GetOrAdd(tickerKey, symbol => GetCurrentPrice(symbol)); 10: } 11: 12: private double GetCurrentPrice(string tickerKey) 13: { 14: // do code to calculate actual true price. 15: } 16: } There are other variations of these two methods which vary whether a value is provided or a factory delegate, but otherwise they work much the same. Oddities with the composite Add methods The AddOrUpdate() and GetOrAdd() methods are totally thread-safe, on this you may rely, but they are not atomic. It is important to note that the methods that use delegates execute those delegates outside of the lock. This was done intentionally so that a user delegate (of which the ConcurrentDictionary has no control of course) does not take too long and lock out other threads. This is not necessarily an issue, per se, but it is something you must consider in your design. The main thing to consider is that your delegate may get called to generate an item, but that item may not be the one returned! Consider this scenario: A calls GetOrAdd and sees that the key does not currently exist, so it calls the delegate. Now thread B also calls GetOrAdd and also sees that the key does not currently exist, and for whatever reason in this race condition it’s delegate completes first and it adds its new value to the dictionary. Now A is done and goes to get the lock, and now sees that the item now exists. In this case even though it called the delegate to create the item, it will pitch it because an item arrived between the time it attempted to create one and it attempted to add it. Let’s illustrate, assume this totally contrived example program which has a dictionary of char to int. And in this dictionary we want to store a char and it’s ordinal (that is, A = 1, B = 2, etc). So for our value generator, we will simply increment the previous value in a thread-safe way (perhaps using Interlocked): 1: public static class Program 2: { 3: private static int _nextNumber = 0; 4: 5: // the holder of the char to ordinal 6: private static ConcurrentDictionary<char, int> _dictionary 7: = new ConcurrentDictionary<char, int>(); 8: 9: // get the next id value 10: public static int NextId 11: { 12: get { return Interlocked.Increment(ref _nextNumber); } 13: } Then, we add a method that will perform our insert: 1: public static void Inserter() 2: { 3: for (int i = 0; i < 26; i++) 4: { 5: _dictionary.GetOrAdd((char)('A' + i), key => NextId); 6: } 7: } Finally, we run our test by starting two tasks to do this work and get the results… 1: public static void Main() 2: { 3: // 3 tasks attempting to get/insert 4: var tasks = new List<Task> 5: { 6: new Task(Inserter), 7: new Task(Inserter) 8: }; 9: 10: tasks.ForEach(t => t.Start()); 11: Task.WaitAll(tasks.ToArray()); 12: 13: foreach (var pair in _dictionary.OrderBy(p => p.Key)) 14: { 15: Console.WriteLine(pair.Key + ":" + pair.Value); 16: } 17: } If you run this with only one task, you get the expected A:1, B:2, ..., Z:26. But running this in parallel you will get something a bit more complex. My run netted these results: 1: A:1 2: B:3 3: C:4 4: D:5 5: E:6 6: F:7 7: G:8 8: H:9 9: I:10 10: J:11 11: K:12 12: L:13 13: M:14 14: N:15 15: O:16 16: P:17 17: Q:18 18: R:19 19: S:20 20: T:21 21: U:22 22: V:23 23: W:24 24: X:25 25: Y:26 26: Z:27 Notice that B is 3? This is most likely because both threads attempted to call GetOrAdd() at roughly the same time and both saw that B did not exist, thus they both called the generator and one thread got back 2 and the other got back 3. However, only one of those threads can get the lock at a time for the actual insert, and thus the one that generated the 3 won and the 3 was inserted and the 2 got discarded. This is why on these methods your factory delegates should be careful not to have any logic that would be unsafe if the value they generate will be pitched in favor of another item generated at roughly the same time. As such, it is probably a good idea to keep those generators as stateless as possible. Summary The ConcurrentDictionary is a very efficient and thread-safe version of the Dictionary generic collection. It has all the benefits of type-safety that it’s generic collection counterpart does, and in addition is extremely efficient especially when there are more reads than writes concurrently. Tweet Technorati Tags: C#, .NET, Concurrent Collections, Collections, Little Wonders, Black Rabbit Coder,James Michael Hare

Read the article

Building a Data Mart with Pentaho Data Integration Video Review by Diethard Steiner, Packt Publishing

- by Compudicted

Originally posted on: http://geekswithblogs.net/Compudicted/archive/2014/06/01/building-a-data-mart-with-pentaho-data-integration-video-review.aspx The Building a Data Mart with Pentaho Data Integration Video by Diethard Steiner from Packt Publishing is more than just a course on how to use Pentaho Data Integration, it also implements and uses the principals of the Data Warehousing (and I even heard the name of Ralph Kimball in the video). Indeed, a video watcher should be familiar with its concepts as the Star Schema, Slowly Changing Dimension types, etc. so I suggest prior to watching this course to consider skimming through the Data Warehouse concepts (if unfamiliar) or even better, read the excellent Ralph’s The Data Warehouse Tooolkit. By the way, the author expands beyond using Pentaho along to MySQL and MonetDB which is a real icing on the cake! Indeed, I even suggest the name of the course should be ‘Building a Data Warehouse with Pentaho’. To successfully complete the course one needs to know some Linux (Ubuntu used in the course), the VI editor and the Bash command shell, but it seems that similar requirements would also apply to the Weindows OS. Additionally, knowing some basic SQL would not hurt. As I had said, MonetDB is used in this course several times which seems to be not anymore complex than say MySQL, but based on what I read is very well suited for fast querying big volumes of data thanks to having a columnstore (vertical data storage). I don’t see what else can be a barrier, the material is very digestible. On this note, I must add that the author does not cover how to acquire the software, so here is what I found may help: Pentaho: the free Community Edition must be more than anyone needs to learn it. Or even go into a POC. MonetDB can be downloaded (exists for both, Linux and Windows) from http://goo.gl/FYxMy0 (just see the appropriate link on the left). The author seems to be using Eclipse to run SQL code, one can get it from http://goo.gl/5CcuN. To create, or edit database entities and/or schema otherwise one can use a universal tool called SQuirreL, get it from http://squirrel-sql.sourceforge.net. Next, I must confess Diethard is very knowledgeable in what he does and beyond. However, there will be some accent heard to the user of the course especially if one’s mother tongue language is English, but it I got over it in a few chapters. I liked the rate at which the material is being presented, it makes me feel I paid for every second Eventually, my impressions are: Pentaho is an awesome ETL offering, it is worth learning it very much (I am an ETL fan and a heavy user of SSIS) MonetDB is nice, it tickles my fancy to know it more Data Warehousing, despite all the BigData tool offerings (Hive, Scoop, Pig on Hadoop), using the traditional tools still rocks Chapters 2 to 6 were the most fun to me with chapter 8 being the most difficult. In terms of closing, I highly recommend this video to anyone who needs to grasp Pentaho concepts quick, likewise, the course is very well suited for any developer on a “supposed to be done yesterday” type of a project. It is for a beginner to intermediate level ETL/DW developer. But one would need to learn more on Data Warehousing and Pentaho, for such I recommend the 5 star Pentaho Data Integration 4 Cookbook. Enjoy it! Disclaimer: I received this video from the publisher for the purpose of a public review.

Read the article

Building a Data Mart with Pentaho Data Integration Video Review by Diethard Steiner, Packt Publishing

- by Compudicted

Originally posted on: http://geekswithblogs.net/Compudicted/archive/2014/06/01/building-a-data-mart-with-pentaho-data-integration-video-review-again.aspx The Building a Data Mart with Pentaho Data Integration Video by Diethard Steiner from Packt Publishing is more than just a course on how to use Pentaho Data Integration, it also implements and uses the principals of the Data Warehousing (and I even heard the name of Ralph Kimball in the video). Indeed, a video watcher should be familiar with its concepts as the Star Schema, Slowly Changing Dimension types, etc. so I suggest prior to watching this course to consider skimming through the Data Warehouse concepts (if unfamiliar) or even better, read the excellent Ralph’s The Data Warehouse Tooolkit. By the way, the author expands beyond using Pentaho along to MySQL and MonetDB which is a real icing on the cake! Indeed, I even suggest the name of the course should be ‘Building a Data Warehouse with Pentaho’. To successfully complete the course one needs to know some Linux (Ubuntu used in the course), the VI editor and the Bash command shell, but it seems that similar requirements would also apply to the Windows OS. Additionally, knowing some basic SQL would not hurt. As I had said, MonetDB is used in this course several times which seems to be not anymore complex than say MySQL, but based on what I read is very well suited for fast querying big volumes of data thanks to having a columnstore (vertical data storage). I don’t see what else can be a barrier, the material is very digestible. On this note, I must add that the author does not cover how to acquire the software, so here is what I found may help: Pentaho: the free Community Edition must be more than anyone needs to learn it. Or even go into a POC. MonetDB can be downloaded (exists for both, Linux and Windows) from http://goo.gl/FYxMy0 (just see the appropriate link on the left). The author seems to be using Eclipse to run SQL code, one can get it from http://goo.gl/5CcuN. To create, or edit database entities and/or schema otherwise one can use a universal tool called SQuirreL, get it from http://squirrel-sql.sourceforge.net. Next, I must confess Diethard is very knowledgeable in what he does and beyond. However, there will be some accent heard to the user of the course especially if one’s mother tongue language is English, but it I got over it in a few chapters. I liked the rate at which the material is being presented, it makes me feel I paid for every second Eventually, my impressions are: Pentaho is an awesome ETL offering, it is worth learning it very much (I am an ETL fan and a heavy user of SSIS) MonetDB is nice, it tickles my fancy to know it more Data Warehousing, despite all the BigData tool offerings (Hive, Scoop, Pig on Hadoop), using the traditional tools still rocks Chapters 2 to 6 were the most fun to me with chapter 8 being the most difficult. In terms of closing, I highly recommend this video to anyone who needs to grasp Pentaho concepts quick, likewise, the course is very well suited for any developer on a “supposed to be done yesterday” type of a project. It is for a beginner to intermediate level ETL/DW developer. But one would need to learn more on Data Warehousing and Pentaho, for such I recommend the 5 star Pentaho Data Integration 4 Cookbook. Enjoy it! Disclaimer: I received this video from the publisher for the purpose of a public review.

Read the article

Internal Mutation of Persistent Data Structures

- by Greg Ros

To clarify, when I mean use the terms persistent and immutable on a data structure, I mean that: The state of the data structure remains unchanged for its lifetime. It always holds the same data, and the same operations always produce the same results. The data structure allows Add, Remove, and similar methods that return new objects of its kind, modified as instructed, that may or may not share some of the data of the original object. However, while a data structure may seem to the user as persistent, it may do other things under the hood. To be sure, all data structures are, internally, at least somewhere, based on mutable storage. If I were to base a persistent vector on an array, and copy it whenever Add is invoked, it would still be persistent, as long as I modify only locally created arrays. However, sometimes, you can greatly increase performance by mutating a data structure under the hood. In more, say, insidious, dangerous, and destructive ways. Ways that might leave the abstraction untouched, not letting the user know anything has changed about the data structure, but being critical in the implementation level. For example, let's say that we have a class called ArrayVector implemented using an array. Whenever you invoke Add, you get a ArrayVector build on top of a newly allocated array that has an additional item. A sequence of such updates will involve n array copies and allocations. Here is an illustration: However, let's say we implement a lazy mechanism that stores all sorts of updates -- such as Add, Set, and others in a queue. In this case, each update requires constant time (adding an item to a queue), and no array allocation is involved. When a user tries to get an item in the array, all the queued modifications are applied under the hood, requiring a single array allocation and copy (since we know exactly what data the final array will hold, and how big it will be). Future get operations will be performed on an empty cache, so they will take a single operation. But in order to implement this, we need to 'switch' or mutate the internal array to the new one, and empty the cache -- a very dangerous action. However, considering that in many circumstances (most updates are going to occur in sequence, after all), this can save a lot of time and memory, it might be worth it -- you will need to ensure exclusive access to the internal state, of course. This isn't a question about the efficacy of such a data structure. It's a more general question. Is it ever acceptable to mutate the internal state of a supposedly persistent or immutable object in destructive and dangerous ways? Does performance justify it? Would you still be able to call it immutable? Oh, and could you implement this sort of laziness without mutating the data structure in the specified fashion?

Read the article

Sabre Manages Fast Data Growth with Oracle Data Integration Products

- by Irem Radzik

Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;} Last year at OpenWorld we announced Sabre Holding as a winner of the Fusion Middleware Innovation Awards. The Sabre team did an excellent job at leveraging cutting edge technologies for managing rapid data growth and exponential scalability demands they have experienced in the travel industry. Today we announced the details and specific benefits of Sabre’s new real-time data integration solution in a press release. Please take a look if you haven’t seen it yet. Sabre Holdings Deploys Oracle Data Integrator and Oracle GoldenGate to Support Rapid Customer Growth There are 3 different areas of benefits Sabre achieved by using Oracle Data Integration products: Manages 7X increase in data sources for the enterprise data warehouse Reduced infrastructure complexity Decreased time to market for new products and services by 30 percent. This simply shows that using latest technologies helps the companies to innovate robust solutions against today’s key data management challenges. And the benefit of using a next generation data integration technology is not only seen in the IT operations, but also in the business side. A better data integration solution for the enterprise data warehouse delivered the platform they need to accelerate how they service their customers, improving their competitive advantage. Tomorrow I will give another great example of innovation with next generation data integration from Oracle. We will be discussing the Fusion Middleware Innovation Awards 2012 winners and their results with using Oracle’s data integration products.

Read the article

implementing dynamic query handler on historical data

- by user2390183

EDIT : Refined question to focus on the core issue Context: I have historical data about property (house) sales collected from various sources in a centralized/cloud data source (assume info collection is handled by a third party) Planning to develop an application to query and retrieve data from this centralized data source Example Queries: Simple : for given XYZ post code, what is average house price for 3 bed room house? Complex: What is estimated price for an house at "DD,Some Street,XYZ Post Code" (worked out from average values of historic data filtered by various characteristics of the house: house post code, no of bed rooms, total area, and other deeper insights like house building type, year of built, features)? In addition to average price, the application should support other property info ** maximum, or minimum price..etc and trend (graph) on a selected property attribute over a period of time**. Hence, the queries should not enforce the search based on a primary key or few fixed fields In other words, queries can be What is the change in 3 Bed Room house price (irrespective of location) over last 30 days? What kind of properties we can get for X price (irrespective of location or house type) The challenge I have is identifying the domain (BI/ Data Analytical or DB Design or DB Query Interface or DW related or something else) this problem (dynamic query on historic data) belong to, so that I can do further exploration My findings so far I could be wrong on the following, so please correct me if you think so I briefly read about BI/Data Analytics - I think it is heavy weight solution for my problem and has scalability issues. DB Design - As I understand RDBMS works well if you know Data model at design time. I am expecting attributes about property or other entity (user) that am going to bring in, would evolve quickly. hence maintenance would be an issue. As I am going to have multiple users executing query at same time, performance would be a bottleneck Other options like Graph DB (http://www.tinkerpop.com/) seems to be bit complex (they are good. but using those tools meant for generic purpose, make me think like assembly programming to solve my problem ) BigData related solution are to analyse data from multiple unrelated domains So, Any suggestion on the space this problem fit in ? (Especially if you have design/implementation experience of back-end for property listing or similar portals)

Read the article

I need some help creating a non-binary tree (or some other data structure that will better solve my problem)

- by EDO

I have about ten lists of numbers and some strings. Each list has about <= 30K lines. Each line on a list has a distinct number. I need to build an efficient way of finding all the lines in each list that has the same 'control' number (or key for dB guys) and comparing what is in their string parts. I am writing this in Java. I have thought about using trees but my brain cells are about burnt now. I need some help.

Read the article

AngularJS dealing with large data sets (Strategy)

- by Brian

I am working on developing a personal temperature logging viewer based on my rasppi curl'ing data into my web server's api. Temperatures are taken every 2 seconds and I can have several temperature sensors posting data. Needless to say I will have a lot of data to handle even within the scope of an hour. I have implemented a very simple paging api from the server so the server doesn't timeout and is currently only returning data in 1000 units per call, then paging through the data. I had the idea to intially show say the last 20 minutes of data from a sensor (or all sensors depending on user choices), then allowing the user to select other timeframes from which to show data. The issue comes in when you want to view all sensors or an extended time period (say 24 hours). Is there a best practice of handling this large amount of data? Would it be useful to load those first 20 minutes into the live view and then cache into local storage something like the last 24 hours? I haven't been able to find a decent idea of this in use yet even though there are a lot of ways to take this problem. I am just looking for some suggestions as to what might provide a good balance between good performance and not caching the entire data set on the client side (as beyond a week of data this might not be feasible).

Read the article

replacing data.frame element-wise operations with data.table (that used rowname)

- by Harold

So lets say I have the following data.frames: df1 <- data.frame(y = 1:10, z = rnorm(10), row.names = letters[1:10]) df2 <- data.frame(y = c(rep(2, 5), rep(5, 5)), z = rnorm(10), row.names = letters[1:10]) And perhaps the "equivalent" data.tables: dt1 <- data.table(x = rownames(df1), df1, key = 'x') dt2 <- data.table(x = rownames(df2), df2, key = 'x') If I want to do element-wise operations between df1 and df2, they look something like dfRes <- df1 / df2 And rownames() is preserved: R> head(dfRes) y z a 0.5 3.1405463 b 1.0 1.2925200 c 1.5 1.4137930 d 2.0 -0.5532855 e 2.5 -0.0998303 f 1.2 -1.6236294 My poor understanding of data.table says the same operation should look like this: dtRes <- dt1[, !'x', with = F] / dt2[, !'x', with = F] dtRes[, x := dt1[,x,]] setkey(dtRes, x) (setkey optional) Is there a more data.table-esque way of doing this? As a slightly related aside, more generally, I would have other columns such as factors in each data.table and I would like to omit those columns while doing the element-wise operations, but still have them in the result. Does this make sense? Thanks!

Read the article

Getting a key in the ActionScript Dictionary

- by Rudy

Hello, I am using a Dictionary in ActionScript as a queue, sort of, still reading most of the time as an associative container, but I need one time to make a loop to run through the whole dictionary, such as for (var key:String in queue) . Inside this for loop I perform some actions on an element and then call delete on that key. My issue is that I would like to wait for an Event before fetching the next element in this queue. Basically my for loop runs too fast. I would like to fetch the next key all the time, but I know there is no built in method. A solution I thought is to add a break to the loop, as the for.. in will automatically fetch the next key, but it would be a loop which always executes one time, simply to fetch the next key. This sounds a bit counter intuitive. I hope my problem makes sense and I really look for some better ideas than what I currently have. Thanks for your help! Rudy

Read the article

Python: Access dictionary value inside of tuple and sort quickly by dict value

- by Aquat33nfan

I know that wasn't clear. Here's what I'm doing specifically. I have my list of dictionaries here: dict = [{int=0, value=A}, {int=1, value=B}, ... n] and I want to take them in combinations, so I used itertools and it gave me a tuple (Well, okay it gave me a memory object that I then used enumerate on so I could loop over it and enumerate gave ma tuple): for (index, tuple) in enumerate(combinations(dict, 2)): and this is where I have my problem. I want to identify which of the two items in the combination has the bigger 'int' value and which has the smaller value and assign them to variables (I'm actually using more than 2 in the combination so I can't just say if tuple[0]['int'] tuple[1]['int'] and do the assignment because I'd have to list this out a bunch of times and that's hard to manage). I was going to assign each 'int' value to a variable, sort it in a list, index the 'int' value in the list by 1, 2, 3, 4, 5 ... etc., then go back and access the dictionary I wanted by the int value and then assign the dictionary to a variable so I knew which was bigger. But I have a big list and lists and variable assignments are resource intensive and this is taking a long time (I had only a little bit of that written and it was taking forever to run). So I was hoping someone knew a fast way to do this. I actually could list out every possible combination of assignmnets using the if/thens but it's just like 5 pages of if/thens and assignments and is hard to read and manage when I want to change it. You've probably gathered this, but I"m new at programming. thx

Search Results

Search found 59969 results on 2399 pages for 'data dictionary'.

Page 10/2399 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >

- by Jordan Messina

- by Gershwyn

- by AP257

- by abatishchev

- by Maudise

- by LarrikJ

- by Bo Tian

- by Julien Testut

- by Justin

- by aCuria

- by Greg K

- by Tony Ouk

- by Highly Irregular

- by Thanos Terentes Printzios

- by James Michael Hare

- by Compudicted

- by Compudicted

- by Greg Ros

- by Irem Radzik

- by user2390183

- by EDO

- by Brian

- by Harold

- by Rudy

- by Aquat33nfan

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17 | Next Page >