iterables - Developer IT

I don't like Python functions that take two or more iterables. Is it a good idea?

- by Xavier Ho

This question came from looking at this question on Stackoverflow. def fringe8((px, py), (x1, y1, x2, y2)): Personally, it's been one of my pet peeves to see a function that takes two arguments with fixed-number iterables (like a tuple) or two or more dictionaries (Like in the Shotgun API). It's just hard to use, because of all the verbosity and double-bracketed enclosures. Wouldn't this be better: >>> class Point(object): ... def __init__(self, x, y): ... self.x = x ... self.y = y ... >>> class Rect(object): ... def __init__(self, x1, y1, x2, y2): ... self.x1 = x1 ... self.y1 = y1 ... self.x2 = x2 ... self.y2 = y2 ... >>> def fringe8(point, rect): ... # ... ... >>> >>> point = Point(2, 2) >>> rect = Rect(1, 1, 3, 3) >>> >>> fringe8(point, rect) Is there a situation where taking two or more iterable arguments is justified? Obviously the standard itertools Python library needs that, but I can't see it being pretty in maintainable, flexible code design.

Read the article

Iterables.find and Iterators.find - instead of throwing exception, get null

- by mjlee

I'm using google-collections and trying to find the first element that satisfies Predicate if not, return me 'null'. Unfortunately, Iterables.find and Iterators.find throws NoSuchElementException when no element is found. Now, I am forced to do Object found = null; if ( Iterators.any( newIterator(...) , my_predicate ) { found = Iterators.find( newIterator(...), my_predicate ) } I can surround by 'try/catch' and do the same thing but for my use-cases, I am going to encounter many cases where no-element is found. Is there a simpler way of doing this?

Read the article

Why does Iterator have a contains method but Iterable does not, in Scala 2.8?

- by adam77

I would like to call contains on my Iterables :-)

Read the article

Is there an equivalent in Scala to Python's more general map function?

- by wheaties

I know that Scala's Lists have a map implementation with signature (f: (A) => B):List[B] and a foreach implementation with signature (f: (A) => Unit):Unit but I'm looking for something that accepts multiple iterables the same way that the Python map accepts multiple iterables. I'm looking for something with a signature of (f: (A,B) => C, Iterable[A], Iterable[B] ):Iterable[C] or equivalent. Is there a library where this exists or a comparable way of doing similar?

Read the article

How to pick a chunksize for python multiprocessing with large datasets

- by Sandro

I am attempting to to use python to gain some performance on a task that can be highly parallelized using http://docs.python.org/library/multiprocessing. When looking at their library they say to use chunk size for very long iterables. Now, my iterable is not long, one of the dicts that it contains is huge: ~100000 entries, with tuples as keys and numpy arrays for values. How would I set the chunksize to handle this and how can I transfer this data quickly? Thank you.

Read the article

How to apply a function to a collection of elements

- by Cue

Consider I have an array of elements out of which I want to create a new 'iterable' which on every next applies a custom 'transformation'. What's the proper way of doing it under python 2.x? For people familiar with Java, the equivalent is Iterables#transform from google's collections framework.

Read the article

JavaScript: Is there any "python's Generator" equivalent in JavaScript?

- by JackSMTV

Is there any "python's Generator" equivalent in JavaScript? PS: Python's Generator is very memory efficient when we need to do one time iterate through a big array, hash... "Generators are iterables, but you can only read them once. It's because they do not store all the values in memory, they generate the values on the fly" (Python's Generator explained in this thread: The Python yield keyword explained )

Read the article

Passing a non-iterable to list.extend ()

- by JS

Hello, I am creating a public method to allow callers to write values to a device, call it write_vals() for example. Since these values will by typed live, I would like to simplify the user's life by allowing them type in either a list or a single value, depending on how many values they need to write. For example: write_to_device([1,2,3]) or write_to_device(1) My function would like to work with a flat list, so I tried to be clever and code something like this: input_list = [] input_list.extend( input_val ) This works swimmingly when the user inputs a list, but fails miserably when the user inputs a single integer: TypeError: 'int' object is not iterable Using list.append() would create a nested list when a list was passed in, which would be an additional hassle to flatten. Checking the type of the object passed in seems clumsy and non-pythonic and wishing that list.extend() would accept non-iterables has gotten me nowhere. So has trying a variety of other coding methods. Suggestions (coding-related only, please) would be greatly appreciated.

Read the article

Simulating C-style for loops in python

- by YGA

(even the title of this is going to cause flames, I realize) Python made the deliberate design choice to have the for loop use explicit iterables, with the benefit of considerably simplified code in most cases. However, sometimes it is quite a pain to construct an iterable if your test case and update function are complicated, and so I find myself writing the following while loops: val = START_VAL while <awkward/complicated test case>: # do stuff ... val = <awkward/complicated update> The problem with this is that the update is at the bottom of the while block, meaning that if I want to have a continue embedded somewhere in it I have to: use duplicate code for the complicated/awkard update, AND run the risk of forgetting it and having my code infinite loop I could go the route of hand-rolling a complicated iterator: def complicated_iterator(val): while <awkward/complicated test case>: yeild val val = <awkward/complicated update> for val in complicated_iterator(start_val): if <random check>: continue # no issues here # do stuff This strikes me as waaaaay too verbose and complicated. Do folks in stack overflow have a simpler suggestion?

Read the article

Library for Dataflow in C

- by msutherl

How can I do dataflow (pipes and filters, stream processing, flow based) in C? And not with UNIX pipes. I recently came across stream.py. Streams are iterables with a pipelining mechanism to enable data-flow programming and easy parallelization. The idea is to take the output of a function that turns an iterable into another iterable and plug that as the input of another such function. While you can already do this using function composition, this package provides an elegant notation for it by overloading the operator. I would like to duplicate a simple version of this kind of functionality in C. I particularly like the overloading of the operator to avoid function composition mess. Wikipedia points to this hint from a Usenet post in 1990. Why C? Because I would like to be able to do this on microcontrollers and in C extensions for other high level languages (Max, Pd*, Python). * (ironic given that Max and Pd were written, in C, specifically for this purpose – I'm looking for something barebones)

Read the article

parallel computation for an Iterator of elements in Java

- by Brian Harris

I've had the same need a few times now and wanted to get other thoughts on the right way to structure a solution. The need is to perform some operation on many elements on many threads without needing to have all elements in memory at once, just the ones under computation. As in, Iterables.partition is insufficient because it brings all elements into memory up front. Expressing it in code, I want to write a BulkCalc2 that does the same thing as BulkCalc1, just in parallel. Below is sample code that illustrates my best attempt. I'm not satisfied because it's big and ugly, but it does seem to accomplish my goals of keeping threads highly utilized until the work is done, propagating any exceptions during computation, and not having more than numThreads instances of BigThing necessarily in memory at once. I'll accept the answer which meets the stated goals in the most concise way, whether it's a way to improve my BulkCalc2 or a completely different solution. interface BigThing { int getId(); String getString(); } class Calc { // somewhat expensive computation double calc(BigThing bigThing) { Random r = new Random(bigThing.getString().hashCode()); double d = 0; for (int i = 0; i < 100000; i++) { d += r.nextDouble(); } return d; } } class BulkCalc1 { final Calc calc; public BulkCalc1(Calc calc) { this.calc = calc; } public TreeMap<Integer, Double> calc(Iterator<BigThing> in) { TreeMap<Integer, Double> results = Maps.newTreeMap(); while (in.hasNext()) { BigThing o = in.next(); results.put(o.getId(), calc.calc(o)); } return results; } } class SafeIterator<T> { final Iterator<T> in; SafeIterator(Iterator<T> in) { this.in = in; } synchronized T nextOrNull() { if (in.hasNext()) { return in.next(); } return null; } } class BulkCalc2 { final Calc calc; final int numThreads; public BulkCalc2(Calc calc, int numThreads) { this.calc = calc; this.numThreads = numThreads; } public TreeMap<Integer, Double> calc(Iterator<BigThing> in) { ExecutorService e = Executors.newFixedThreadPool(numThreads); List<Future<?>> futures = Lists.newLinkedList(); final Map<Integer, Double> results = new MapMaker().concurrencyLevel(numThreads).makeMap(); final SafeIterator<BigThing> it = new SafeIterator<BigThing>(in); for (int i = 0; i < numThreads; i++) { futures.add(e.submit(new Runnable() { @Override public void run() { while (true) { BigThing o = it.nextOrNull(); if (o == null) { return; } results.put(o.getId(), calc.calc(o)); } } })); } e.shutdown(); for (Future<?> future : futures) { try { future.get(); } catch (InterruptedException ex) { // swallowing is OK } catch (ExecutionException ex) { throw Throwables.propagate(ex.getCause()); } } return new TreeMap<Integer, Double>(results); } }

Read the article

CodePlex Daily Summary for Monday, January 31, 2011

CodePlex Daily Summary for Monday, January 31, 2011Popular ReleasesMVC Controls Toolkit: Mvc Controls Toolkit 0.8: Fixed the following bugs: *Variable name error in the jvascript file that prevented the use of the deleted item template of the Datagrid *Now after the changes applied to an item of the DataGrid are cancelled all input fields are reset to the very initial value they had. *Other minor bugs. Added: *This version is available both for MVC2, and MVC 3. The MVC 3 version has a release number of 0.85. This way one can install both version. *Client Validation support has been added to all control...Office Web.UI: Beta preview (Source): This is the first Beta. it includes full source code and all available controls. Some designers are not ready, and some features are not finalized allready (missing properties, draft styles) ThanksASP.net Ribbon: Version 2.2: This release brings some new controls (part of Office Web.UI). A few bugs are fixed and it includes the "auto resize" feature as you resize the window. (It can cause an infinite loop when the window is too reduced, it's why this release is not marked as "stable"). I will release more versions 2.3, 2.4... until V3 which will be the official launch of Office Web.UI. Both products will evolve at the same speed. Thanks.Barcode Rendering Framework: 2.1.1.0: Final release for VS2008 Finally fixed bugs with code 128 symbology.HERB.IQ: HERB.IQ.UPGRADE.0.5.3.exe: HERB.IQ.UPGRADE.0.5.3.exexUnit.net - Unit Testing for .NET: xUnit.net 1.7: xUnit.net release 1.7Build #1540 Important notes for Resharper users: Resharper support has been moved to the xUnit.net Contrib project. Important note for TestDriven.net users: If you are having issues running xUnit.net tests in TestDriven.net, especially on 64-bit Windows, we strongly recommend you upgrade to TD.NET version 3.0 or later. This release adds the following new features: Added support for ASP.NET MVC 3 Added Assert.Equal(double expected, double actual, int precision) Ad...DoddleReport - Automatic HTML/Excel/PDF Reporting: DoddleReport 1.0: DoddleReport will add automatic tabular-based reporting (HTML/PDF/Excel/etc) for any LINQ Query, IEnumerable, DataTable or SharePoint List For SharePoint integration please click Here PDF Reporting has been placed into a separate assembly because it requies AbcPdf http://www.websupergoo.com/download.htmSpark View Engine: Spark v1.5: Release Notes There have been a lot of minor changes going on since version 1.1, but most important to note are the major changes which include: Support for HTML5 "section" tag. Spark has now renamed its own section tag to "segment" instead to avoid clashes. You can still use "section" in a Spark sense for legacy support by specifying ParseSectionAsSegment = true if needed while you transition Bindings - this is a massive feature that further simplifies your views by giving you a powerful ...Marr DataMapper: Marr DataMapper 1.0.0 beta: First release.WPF Application Framework (WAF): WPF Application Framework (WAF) 2.0.0.3: Version: 2.0.0.3 (Milestone 3): This release contains the source code of the WPF Application Framework (WAF) and the sample applications. Requirements .NET Framework 4.0 (The package contains a solution file for Visual Studio 2010) The unit test projects require Visual Studio 2010 Professional Remark The sample applications are using Microsoft’s IoC container MEF. However, the WPF Application Framework (WAF) doesn’t force you to use the same IoC container in your application. You can use ...Rawr: Rawr 4.0.17 Beta: Rawr is now web-based. The link to use Rawr4 is: http://elitistjerks.com/rawr.phpThis is the Cataclysm Beta Release. More details can be found at the following link http://rawr.codeplex.com/Thread/View.aspx?ThreadId=237262 and on the Version Notes page: http://rawr.codeplex.com/wikipage?title=VersionNotes As of the 4.0.16 release, you can now also begin using the new Downloadable WPF version of Rawr!This is a pre-alpha release of the WPF version, there are likely to be a lot of issues. If you...Squiggle - A Free open source LAN Messenger: Squiggle 2.5 Beta: In this release following are the new features: Localization: Support for Arabic, French, German and Chinese (Simplified) Bridge: Connect two Squiggle nets across the WAN or different subnets Aliases: Special codes with special meaning can be embedded in message like (version),(datetime),(time),(date),(you),(me) Commands: cls, /exit, /offline, /online, /busy, /away, /main Sound notifications: Get audio alerts on contact online, message received, buzz Broadcast for group: You can ri...VivoSocial: VivoSocial 7.4.2: Version 7.4.2 of VivoSocial has been released. If you experienced any issues with the previous version, please update your modules to the 7.4.2 release and see if they persist. If you have any questions about this release, please post them in our Support forums. If you are experiencing a bug or would like to request a new feature, please submit it to our issue tracker. Web Controls * Updated Business Objects and added a new SQL Data Provider File. Groups * Fixed a security issue whe...PHP Manager for IIS: PHP Manager 1.1.1 for IIS 7: This is a minor release of PHP Manager for IIS 7. It contains all the functionality available in 56962 plus several bug fixes (see change list for more details). Also, this release includes Russian language support. SHA1 codes for the downloads are: PHPManagerForIIS-1.1.0-x86.msi - 6570B4A8AC8B5B776171C2BA0572C190F0900DE2 PHPManagerForIIS-1.1.0-x64.msi - 12EDE004EFEE57282EF11A8BAD1DC1ADFD66A654mojoPortal: 2.3.6.1: see release notes on mojoportal.com http://www.mojoportal.com/mojoportal-2361-released.aspx Note that we have separate deployment packages for .NET 3.5 and .NET 4.0 The deployment package downloads on this page are pre-compiled and ready for production deployment, they contain no C# source code. To download the source code see the Source Code Tab I recommend getting the latest source code using TortoiseHG, you can get the source code corresponding to this release here.Parallel Programming with Microsoft Visual C++: Drop 6 - Chapters 4 and 5: This is Drop 6. It includes: Drafts of the Preface, Introduction, Chapters 2-7, Appendix B & C and the glossary Sample code for chapters 2-7 and Appendix A & B. The new material we'd like feedback on is: Chapter 4 - Parallel Aggregation Chapter 5 - Futures The source code requires Visual Studio 2010 in order to run. There is a known bug in the A-Dash sample when the user attempts to cancel a parallel calculation. We are working to fix this.NodeXL: Network Overview, Discovery and Exploration for Excel: NodeXL Excel Template, version 1.0.1.160: The NodeXL Excel template displays a network graph using edge and vertex lists stored in an Excel 2007 or Excel 2010 workbook. What's NewThis release improves NodeXL's Twitter and Pajek features. See the Complete NodeXL Release History for details. Installation StepsFollow these steps to install and use the template: Download the Zip file. Unzip it into any folder. Use WinZip or a similar program, or just right-click the Zip file in Windows Explorer and select "Extract All." Close Ex...Kooboo CMS: Kooboo CMS 3.0 CTP: Files in this downloadkooboo_CMS.zip: The kooboo application files Content_DBProvider.zip: Additional content database implementation of MSSQL, RavenDB and SQLCE. Default is XML based database. To use them, copy the related dlls into web root bin folder and remove old content provider dlls. Content provider has the name like "Kooboo.CMS.Content.Persistence.SQLServer.dll" View_Engines.zip: Supports of Razor, webform and NVelocity view engine. Copy the dlls into web root bin folder to enable...UOB & ME: UOB ME 2.6: UOB ME 2.6????: ???? V1.0: ???? V1.0 ??New ProjectsAuto Complete Control for ASP.NET: Autocomplete Control is a fully functional ASP.NET control for word suggestions and autocomplete. We had been using Ajax Control Toolkit AutoComplete Extender in our projects before, but we have needed some extra features and functionalities.Cours ESIEE: MAJ des cours ESIEE depuis la plateforme Icampus et autres documentsEngineering World Expenses: Demo expenses application for Engineering World 2011Entity Framework / Linq to Sql Poco Code Generator: Poco Orm data access layer (Dto) code generator for Entity Framework and Linq to Sql. Customizable code generation via simple templating system. Utilizes Managed Extensibility Framework (MEF) in order for application parts to dynamically composed and plug-able.linqish.py: Python module for manipulating iterables. An implementation of the .Net Framework's Linq to Objects for Python.Machinekey setter: This code sample is Windows Azure SDK 1.3 custom plugin. This sample do working at set custom key to machinekey of web.config file in your WebRole.MapReduce.NET: MapReduce.NET intends to implement the original paper proposed by Google on MapReduce.Marr DataMapper: Marr DataMapper provides a fast and easy to use wrapper around ADO.NET that enables you to focus more on your data access queries without having to write plumbing code. Load one-to-one, one-to-many, and hierarchical entity models with ease. No special base class required.Orchard Silverlight: Orchard module enabling embedding Silverlight applications and creating Silverlight-based content.RouteMagic: Library of useful routing helpers and classes.Smart Skelta Utilites: Smart Skelta Utilies will provide utilties like Visual Studio 2008 Skelta Starter Kit(Project Templates and Project Item Templates),Code Snippets for Skelta Components,Skleta Attachment Extracter Web based Logger,Skelta Server utility and others for skelta based development.Solfix: Solfix is a programming language tbat is work-in-progress, but it has a lot of functionality! You can make applications for console to windows applications. The main point of Solfix is to make coding easier and less time than before.SQLite Manager: A minimal manage for sqlite databases.State Search: StateSearch provides state search algoritms such as A*, IDA*, BestFirst, etc to solve problems such as puzzles and/or path searchingTable Check Custom Field Type: SharePoint Custom Field Type for displaying a list of values with checkboxes and people editors.testsgb: testWindows Phone 7 Extension Framework: An extension method framework for Windows Phone 7 to make your code more fluent and adding a lot of common functions you don't need to reproduce.

Read the article

Solving embarassingly parallel problems using Python multiprocessing

- by gotgenes

How does one use multiprocessing to tackle embarrassingly parallel problems? Embarassingly parallel problems typically consist of three basic parts: Read input data (from a file, database, tcp connection, etc.). Run calculations on the input data, where each calculation is independent of any other calculation. Write results of calculations (to a file, database, tcp connection, etc.). We can parallelize the program in two dimensions: Part 2 can run on multiple cores, since each calculation is independent; order of processing doesn't matter. Each part can run independently. Part 1 can place data on an input queue, part 2 can pull data off the input queue and put results onto an output queue, and part 3 can pull results off the output queue and write them out. This seems a most basic pattern in concurrent programming, but I am still lost in trying to solve it, so let's write a canonical example to illustrate how this is done using multiprocessing. Here is the example problem: Given a CSV file with rows of integers as input, compute their sums. Separate the problem into three parts, which can all run in parallel: Process the input file into raw data (lists/iterables of integers) Calculate the sums of the data, in parallel Output the sums Below is traditional, single-process bound Python program which solves these three tasks: #!/usr/bin/env python # -*- coding: UTF-8 -*- # basicsums.py """A program that reads integer values from a CSV file and writes out their sums to another CSV file. """ import csv import optparse import sys def make_cli_parser(): """Make the command line interface parser.""" usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV", __doc__, """ ARGUMENTS: INPUT_CSV: an input CSV file with rows of numbers OUTPUT_CSV: an output file that will contain the sums\ """]) cli_parser = optparse.OptionParser(usage) return cli_parser def parse_input_csv(csvfile): """Parses the input CSV and yields tuples with the index of the row as the first element, and the integers of the row as the second element. The index is zero-index based. :Parameters: - `csvfile`: a `csv.reader` instance """ for i, row in enumerate(csvfile): row = [int(entry) for entry in row] yield i, row def sum_rows(rows): """Yields a tuple with the index of each input list of integers as the first element, and the sum of the list of integers as the second element. The index is zero-index based. :Parameters: - `rows`: an iterable of tuples, with the index of the original row as the first element, and a list of integers as the second element """ for i, row in rows: yield i, sum(row) def write_results(csvfile, results): """Writes a series of results to an outfile, where the first column is the index of the original row of data, and the second column is the result of the calculation. The index is zero-index based. :Parameters: - `csvfile`: a `csv.writer` instance to which to write results - `results`: an iterable of tuples, with the index (zero-based) of the original row as the first element, and the calculated result from that row as the second element """ for result_row in results: csvfile.writerow(result_row) def main(argv): cli_parser = make_cli_parser() opts, args = cli_parser.parse_args(argv) if len(args) != 2: cli_parser.error("Please provide an input file and output file.") infile = open(args[0]) in_csvfile = csv.reader(infile) outfile = open(args[1], 'w') out_csvfile = csv.writer(outfile) # gets an iterable of rows that's not yet evaluated input_rows = parse_input_csv(in_csvfile) # sends the rows iterable to sum_rows() for results iterable, but # still not evaluated result_rows = sum_rows(input_rows) # finally evaluation takes place as a chain in write_results() write_results(out_csvfile, result_rows) infile.close() outfile.close() if __name__ == '__main__': main(sys.argv[1:]) Let's take this program and rewrite it to use multiprocessing to parallelize the three parts outlined above. Below is a skeleton of this new, parallelized program, that needs to be fleshed out to address the parts in the comments: #!/usr/bin/env python # -*- coding: UTF-8 -*- # multiproc_sums.py """A program that reads integer values from a CSV file and writes out their sums to another CSV file, using multiple processes if desired. """ import csv import multiprocessing import optparse import sys NUM_PROCS = multiprocessing.cpu_count() def make_cli_parser(): """Make the command line interface parser.""" usage = "\n\n".join(["python %prog INPUT_CSV OUTPUT_CSV", __doc__, """ ARGUMENTS: INPUT_CSV: an input CSV file with rows of numbers OUTPUT_CSV: an output file that will contain the sums\ """]) cli_parser = optparse.OptionParser(usage) cli_parser.add_option('-n', '--numprocs', type='int', default=NUM_PROCS, help="Number of processes to launch [DEFAULT: %default]") return cli_parser def main(argv): cli_parser = make_cli_parser() opts, args = cli_parser.parse_args(argv) if len(args) != 2: cli_parser.error("Please provide an input file and output file.") infile = open(args[0]) in_csvfile = csv.reader(infile) outfile = open(args[1], 'w') out_csvfile = csv.writer(outfile) # Parse the input file and add the parsed data to a queue for # processing, possibly chunking to decrease communication between # processes. # Process the parsed data as soon as any (chunks) appear on the # queue, using as many processes as allotted by the user # (opts.numprocs); place results on a queue for output. # # Terminate processes when the parser stops putting data in the # input queue. # Write the results to disk as soon as they appear on the output # queue. # Ensure all child processes have terminated. # Clean up files. infile.close() outfile.close() if __name__ == '__main__': main(sys.argv[1:]) These pieces of code, as well as another piece of code that can generate example CSV files for testing purposes, can be found on github. I would appreciate any insight here as to how you concurrency gurus would approach this problem. Here are some questions I had when thinking about this problem. Bonus points for addressing any/all: Should I have child processes for reading in the data and placing it into the queue, or can the main process do this without blocking until all input is read? Likewise, should I have a child process for writing the results out from the processed queue, or can the main process do this without having to wait for all the results? Should I use a processes pool for the sum operations? If yes, what method do I call on the pool to get it to start processing the results coming into the input queue, without blocking the input and output processes, too? apply_async()? map_async()? imap()? imap_unordered()? Suppose we didn't need to siphon off the input and output queues as data entered them, but could wait until all input was parsed and all results were calculated (e.g., because we know all the input and output will fit in system memory). Should we change the algorithm in any way (e.g., not run any processes concurrently with I/O)?

Search Results

Search found 13 results on 1 pages for 'iterables'.

Page 1/1 | 1

- by Xavier Ho

- by mjlee

- by adam77

- by wheaties

- by Sandro

- by Cue

- by JackSMTV

- by JS

- by YGA

- by msutherl

- by Brian Harris

- by gotgenes