Search Results

Search found 1795 results on 72 pages for 'veritas cluster'.

Page 57/72 | < Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >

Parallelism in Python

- by fmark

What are the options for achieving parallelism in Python? I want to perform a bunch of CPU bound calculations over some very large rasters, and would like to parallelise them. Coming from a C background, I am familiar with three approaches to parallelism: Message passing processes, possibly distributed across a cluster, e.g. MPI. Explicit shared memory parallelism, either using pthreads or fork(), pipe(), et. al Implicit shared memory parallelism, using OpenMP. Deciding on an approach to use is an exercise in trade-offs. In Python, what approaches are available and what are their characteristics? Is there a clusterable MPI clone? What are the preferred ways of achieving shared memory parallelism? I have heard reference to problems with the GIL, as well as references to tasklets. In short, what do I need to know about the different parallelization strategies in Python before choosing between them?

Read the article
Windows/.NET Load Distribution & Balancing

- by andrewbadera

Hi all, Is there a vetted Windows-friendly, or even .NET-native, load-distributing/load-balancing utility out there along the lines of HA Proxy? We have a .NET stack product, and the one piece that we step out of the stack is for load-balancing. We need something with configurable rules for distribution -- perhaps subdomain-driven -- that NLB alone doesn't seem to offer. If it integrates directly with .NET, or offers an exposed API callable by webservices, so much the better! Thanks in advance! Clarification: we need to logically part over boxes. This is not just a cluster/failover/replication scenario.

Read the article
Hadoop on Amazon EC2 : Job tracker not starting properly

- by Algorist

Hi, We are running Hadoop on Amazon EC2 cluster. We start the master, slaves and attach the ebs volumes and finally waiting for hadoop jobtracker, tasktracker etc to start and we have timeout of 3600 seconds. We are noticing 50% of the time that job tracker is not able to start before the timeout. Reason being, hdfs is not initialized properly and still in safemode and job tracker is unable to start. I noticed few connectivity issues between nodes on EC2 as I tried manually pinging slaves. Did anyone face similar issue and know how to solve this? Thank you Bala

Read the article
Clojure / HBase: How to Import HBaseTestingUtility in v0.94.6.1

- by David Williams

In Clojure, if I want to start a test cluster using the hbase testing utility, I have to annotate my dependencies with: [org.apache.hbase/hbase "0.92.2" :classifier "tests" :scope "test"] First of all, I have no idea what this means. According to leiningens sample project.clj ;; Dependencies are listed as [group-id/name version]; in addition ;; to keywords supported by Pomegranate, you can use :native-prefix ;; to specify a prefix. This prefix is used to extract natives in ;; jars that don't adhere to the default "<os>/<arch>/" layout that ;; Leiningen expects. Question 1: What does that mean? Question 2: If I upgrade the version: [org.apache.hbase/hbase "0.94.6.1" :classifier "tests" :scope "test"] Then I receive a ClassNotFoundException Exception in thread "main" java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration Whats going on here and how do I fix it?

Read the article
MPAPI vs MPI.NET vs ?

- by Olexandr

I'm working on college project. I have to develop distributed computing system. And i decided to do some research to make this task fun :) I've found MPAPI and MPI.NET libraries. Yes, they are .NET libraries(Mono, in my case). Why .NET ? I'm choosing between Ada, C++ and C# so to i've choosed C# because of lower development time. I have two goals: Simplicity; Performance; Cluster computing. So, what to choose - MPAPI or MPI.NET or something else ?

Read the article
java.io.IOException: Invalid argument

- by Luixv

Hi I have a web application running in cluster mode with a load balancer. It consists in two tomcats (T1, and T2) addressing only one DB. T2 is nfs mounted to T1. This is the only dofference between both nodes. I have a java method generating some files. If the request runs on T1 there is no problem but if the request is running on node 2 I get an exception as follows: java.io.IOException: Invalid argument at java.io.FileOutputStream.close0(Native Method) at java.io.FileOutputStream.close(FileOutputStream.java:279) The corresponding code is as follows: for (int i = 0; i < dataFileList.size(); i++) { outputFileName = outputFolder + fileNameList.get(i); FileOutputStream fileOut = new FileOutputStream(outputFileName); fileOut.write(dataFileList.get(i), 0, dataFileList.get(i).length); fileOut.flush(); fileOut.close(); } The exception appears at the fileOut.close() Any hint? Luis

Read the article
DynaCache invalidation in clustered environment

- by Ravi

We are using Horizontal cluster in our PROD with WAS 6.1 as the Application server. We have enabled dynacache service for some of the JSP fragments using SHARED-PUSH in cachespec.xml file. Now we want to do cache invalidation programmatically..ie. whenever something changes in DB related to cache the cache should get invalidated. so can you please let me know what steps are involved in it to achieve this? any configuration settings at server side or any development changes.

Read the article
Efficiency of while(true) ServerSocket Listen

- by Submerged

I am wondering if a typical while(true) ServerSocket listen loop takes an entire core to wait and accept a client connection (Even when implementing runnable and using Thread .start()) I am implementing a type of distributed computing cluster and each computer needs every core it has for computation. A Master node needs to communicate with these computers (invoking static methods that modify the algorithm's functioning). The reason I need to use sockets is due to the cross platform / cross language capabilities. In some cases, PHP will be invoking these java static methods. I used a java profiler (YourKit) and I can see my running ServerSocket listen thread and it never sleeps and it's always running. Is there a better approach to do what I want? Or, will the performance hit be negligible? Please, feel free to offer any suggestion if you can think of a better way (I've tried RMI, but it isn't supported cross-language. Thanks everyone

Read the article
ssh script gives "key_read" error

- by lugte098

I'm using a script that connects to a cluster through ssh and sends some commands, then quits the connection. This script basically connects once using ssh, then executes a script in this session. This script loops through a list of commands a few times and after it is finished, the connection is terminated. So this script works fine, except for the fact that after a few loops it gives me the following error at loop 22. And then again at loop 32. The loops do exactly the same thing, so i cannot grasp the problem the script is facing. This is the error: key_read: uudecode AAAAB3NzaC1yc2EAAAABIwAAAQEAxmNx2hcXLpTjuaa3yKC3B9gbF7KprP2/ CH8fBgMbCyIcOB+ZMQDmEnbVTqedBwV/mxjZzorEpHTM8MX2WsTjFsxwzDgcpuxm+3cwfb0WSy9Y4Kb F8crAsRDbBIpUZ2n/iSdRcds9nTjk6PA61kTS24RLACHpqF18vudlO5WcbCOnAwa+DdUs0Raw29UiQc BaC6M4YPnApq9Ayy7a6qFI2uK6efkwfLTZIDivWlIdLpRLEyuBEpozQQhEd0mrGhR/ Gl1GevRvFMms14130xQ4A5UpJSn6CmrRIWBkcgp1TilqDGQ1F5xZOinnc4C00gFrbT3hkkQqY5A9p node023,10.141.0.31 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAxmNx2hcXLpTjuaa3yKC3 B9gbF7KprP2/CH8fBgMbCyIcOB+ZMQDmEnbVTqedBwV/mxjZzorEpHTM8MX2WsTjFsxwzDgcpuxm+ 3cwfb0WSy9Y4KbF8crAsRDbBIpUZ2n/iSdRcds9nTjk6PA61kTS24RLACHpqF18vudlO5WcbCOnAw a+DdUs0Raw29UiQcBaC6M4YPnApq9Ayy7a6qFI2uK6efkwfLTZIDivWlIdLpRLEyuBEpozQQhEd0m rGhR/Gl1GevRvFMms14130xQ4A5UpJSn6CmrRIWBkcgp1TilqDGQ1F5xZOinnc4C00gFrbT3hkkQqY5 A9pa0lQHFkSw==

Read the article
Is AMQP suitable as both an intra and inter-machine software bus?

- by Bwooce

I'm trying to get my head around AMQP. It looks great for inter-machine (cluster, LAN, WAN) communication between applications but I'm not sure if it is suitable (in architectural, and current implementation terms) for use as a software bus within one machine. Would it be worth pulling out a current high performance message passing framework to replace it with AMQP, or is this falling into the same trap as RPC by blurring the distinction between local and non-local communication? I'm also wary of the performance impacts of using a WAN technology for intra-machine communications, although this may be more of an implementation concern than architecture. War stories would be appreciated.

Read the article
SQL Server 2008 BULK INSERT causes more reads than writes. Why?

- by sh1ng

I've huge a table (a few billion rows) with a clustered index and two non-clustered indices. A BULK INSERT operation produces 112000 reads and only 383 writes (duration 19948ms). It's very confusing to me. Why do reads exceed writes? How can I reduce it? update query insert bulk DenormalizedPrice4 ([DP_ID] BigInt, [DP_CountryID] Int, [DP_OperatorID] SmallInt, [DP_OperatorPriceID] BigInt, [DP_SpoID] Int, [DP_TourTypeID] Int, [DP_CheckinDate] Date, [DP_CurrencyID] SmallInt, [DP_Cost] Decimal(9,2), [DP_FirstCityID] Int, [DP_FirstHotelID] Int, [DP_FirstBuildingID] Int, [DP_FirstHotelGlobalStarID] Int, [DP_FirstHotelGlobalMealID] Int, [DP_FirstHotelAccommodationTypeID] Int, [DP_FirstHotelRoomCategoryID] Int, [DP_FirstHotelRoomTypeID] Int, [DP_Days] TinyInt, [DP_Nights] TinyInt, [DP_ChildrenCount] TinyInt, [DP_AdultsCount] TinyInt, [DP_TariffID] Int, [DP_DepartureCityID] Int, [DP_DateCreated] SmallDateTime, [DP_DateDenormalized] SmallDateTime, [DP_IsHide] Bit, [DP_FirstHotelAccommodationID] Int) with (CHECK_CONSTRAINTS) No triggers & foreign keys Cluster Index by DP_ID and two non-unique indexes(with fillfactor=90%) And one more thing DB stored on RAID50 with stripe size 256K

Read the article
EJB3.1 Remote invocation - is it distributed automatically? is it expensive?

- by Hank

I'm building a JEE6 application with performance and scalability in the forefront of my mind. Business logic and JPA2-facade is held in stateless session beans (EJB3.1). As of right now, the SLSBs implement only @Remote-interfaces. When a bean needs to access another bean, it does so via RMI. My reasoning behind this is the assumption that, once the application runs on a bunch of clustered application servers, the RMI-part allows the execution to be distributed across the whole cluster automagically. Is that a correct assumption? I'm fine with dealing with the downsides of that (objects lose entityManager session, pass-by-value), at least I think so. But I am wondering if constant remote invocation isn't adding more load then necessary.

Read the article
Orbital equations, and power required to run them

- by Adam Davis

Due to a discussion on the SO IRC today, I'm curious about orbital mechanics, and The equations needed to solve orbital problems The computing power required to solve complex problems The question in particular is calculating when the Earth will plow into the Sun (or vice versa, depending on the frame of reference). I suspect that all the gravitational pulls within our solar system may need to be calculated, which makes me wonder what type of computer cluster is required, or can this be done on a single box? I don't have the experience to do a back of the napkin test here, but perhaps you do? Also, much thx to Gortok for the original inspiration (see comments).

Read the article
Genetic programming in c++, library suggestions?

- by shuttle87

I'm looking to add some genetic algorithms to an Operations research project I have been involved in. Currently we have a program that aids in optimizing some scheduling and we want to add in some heuristics in the form of genetic algorithms. Are there any good libraries for generic genetic programming/algorithms in c++? Or would you recommend I just code my own? I should add that while I am not new to c++ I am fairly new to doing this sort of mathematical optimization work in c++ as the group I worked with previously had tended to use a proprietary optimization package. We have a fitness function that is fairly computationally intensive to evaluate and we have a cluster to run this on so parallelized code is highly desirable. So is c++ a good language for this? If not please recommend some other ones as I am willing to learn another language if it makes life easier. thanks!

Read the article
Illegal instruction gcc assembler.

- by Bernt

In assembler: .globl _test _test: pushl %ebp movl %esp, %ebp movl $0, %eax pushl %eax popl %ebp ret Calling from c main() { _test(); } Compile: gcc -m32 -o test test.c test.s This code gives me illegal instruction sometimes and segment fault other times. In gdc i always get illegal instruction, this is just a simple test, i had a larger program that was working and suddenly after no apperant reason stopped working, now i always get this error even if i start from scratch like above. I have narrowed it down to pushl %eax (or any other register....), if i comment out that line the code runs fine. Any ideas? (I'm running the program at my universities linux cluster, so I have not changed any settings..)

Read the article
Separation of static and dynamic content in Java EE applications

- by Dan

We work with IBM products and we typically use IBM Http Servers (read Apache) as a reverse proxy for our application servers. For performance reasons we serve static content (.gif, .jpg, .css, .html etc.) from our http servers, to ease the burden a bit from the application server. So far, we have to distribute files to http server and configure it manually (writing custom scripts at best.) The problem is the effort needed to keep everything in synch, especially when you need to update the app. Does any Java EE product support this “out of the box”? Is there a way to have application server do this automatically, like in cluster configuration for example, where master node is in charge of distributing the application to other nodes and for keeping everything in synch.

Read the article
How to tell process id within Python

- by R S

Hey, I am working with a cluster system over linux (www.mosix.org) that allows me to run jobs and have the system run them on different computers. Jobs are run like so: mosrun ls & This will naturally create the process and run it on the background, returning the process id, like so: [1] 29199 Later it will return. I am writing a Python infrastructure that would run jobs and control them. For that I want to run jobs using the mosrun program as above, and save the process ID of the spawned process (29199 in this case). This naturally cannot be done using os.system or commands.getoutput, as the printed ID is not what the process prints to output... Any clues? Edit: Since the python script is only meant to initially run the script, the scripts need to run longer than the python shell. I guess it means the mosrun process cannot be the script's "son process". Any suggestions? Thanks

Read the article
Cassandra performance slow down with counter column

- by tubcvt

I have a cluster (4 node ) and a node have 16 core and 24 gb ram: 192.168.23.114 datacenter1 rack1 Up Normal 44.48 GB 25.00% 192.168.23.115 datacenter1 rack1 Up Normal 44.51 GB 25.00% 192.168.23.116 datacenter1 rack1 Up Normal 44.51 GB 25.00% 192.168.23.117 datacenter1 rack1 Up Normal 44.51 GB 25.00% We use about 10 column family (counter column) to make some system statistic report. Problem on here is that When i set replication_factor of this keyspace from 1 to 2 (contain 10 counter column family ), all cpu of node increase from 10% ( when use replication factor=1) to --- 90%. :( :( who can help me work around that :( . why counter column consume too much cpu time :(. thanks all

Read the article
Should I go with Varnish instead of nginx?

- by gotts

I really like nginx. But recently I've found that varnish gives you an opportunity to implement smart caching revers proxy layer(with URL purging). I have a cluster of mongrels which are pretty resource-intensive so if this caching layer can remove some load from mongrels this can be a great thing. I didn't find a way to implement the caching layer(with for application pages; static content is cacheable of course) same with nginx.. Should I use Varnish instead? What would you recommend?

Read the article
What are some good ways to do intermachine locking?

- by mike

Our server cluster consists of 20 machines, each with 10 pids of 5 threads. We'd like some way to prevent any two threads, in any pid, on any machine, from modifying the same object at the same time. Our code's written in Python and runs on Linux, if that helps narrow things down. Also, it's a pretty rare case that two such threads want to do this, so we'd prefer something that optimizes the "only one thread needs this object" case to be really fast, even if it means that the "one thread has locked this object and another one needs it" case isn't great. What are some of the best practices?

Read the article
tomcat session replication without multicast

- by Andreas Petersson

i am planning to use 2 dedicated root servers rented at a hosting provider. those machines will run tomcat 6 in a cluster. if i will add additional machines later on - it is unlikely that they will be accessible with multicast, because they will be located in different subnets. is it possible to run tomcat without multicast? all tutorials for tomcat 6 clustering include multicast heartbeat. are there any alternatives to SimpleTcpCluster? or are other alternatives more appropriate in this situation?

Read the article
Algorithm for redirecting the traffic

- by TechGeeky

I was going through the interview questions and found out the below question which I am not able to answer it. Can anyone provide some sort of algorithm for this problem how can I solve it? There are a cluster of stateless servers all serving the same pages. The servers are hosting 5 web pages- p1.html, p2.html, p3.html, p4.html and p5.html p1.html just redirects users to the other 4 pages Requests to p1.html should result in 10% of users being redirected to p2.html, 5% of users redirected to p3.html, 20% of users redirected to p4.html, and 65% of users redirected to p5.html. Users do not need to stick to the page they are first redirected to. They could end up on a different page with every request to p1.html Write a function/pseudocode that would be invoked with every request to p1.html and redirect the correct percentage of users to the correct page. Any suggestions will be of great help.

Read the article
Tool for response time analysis on JBoss server?

- by Ariel Vardi

I am running a pretty high traffic cluster of JBoss servers serving REST requests and I am interested in tools reading the access logs in Tomcat format (with %D parameter) to provide a detailed analysis of the response time on a per-call basis. Ideally this tool would generate a chart showing the progression of the response time throughout the day, hour per hour, then a weekly view with averages on the day, and monthly with average on the weeks (CACTI style). I've looked for such tools and couldn't find anything. Is any of you guys aware of something close to that before I start writing my own? I haven't looked into CACTI extensions yet, but that be an option?

Read the article
Can't find netbooted for Kerrighed pxe boot with Ubuntu Lucid Server

- by Pengin

I'm following installtion guides for pxe booting and kerrighed. I can't find the package nfsbooted for Ubuntu 10.04. Where did it go? Context: At work I have access to 8 mini-ITX PCs and am trying to build a cluster. My plans include trying Condor, GridGain, Hadoop, and recently Kerrighed has caught my eye. (I reaslise these are all for different kinds of things, I'm just evaluating). Ideally, I'd like to have all the nodes network boot from a single server, since that seems so much easier to manage, plus I can 'borrow' additional PCs for a while without touching their HD. I've been getting on great with Ubuntu Lucid Server (10.04), trying to follow the only guides I can find to get pxe booting (and ultimately kerrighed) to work. This guide is for Ubuntu 8.04 and this one is for Debian. They both refer to a package I can't seem to find, nfsbooted. Has this package been replaced? Am I doing something daft?

Read the article
k-means clustering in R on very large, sparse matrix?

- by movingabout

Hello, I am trying to do some k-means clustering on a very large matrix. The matrix is approximately 500000 rows x 4000 cols yet very sparse (only a couple of "1" values per row). The whole thing does not fit into memory, so I converted it into a sparse ARFF file. But R obviously can't read the sparse ARFF file format. I also have the data as a plain CSV file. Is there any package available in R for loading such sparse matrices efficiently? I'd then use the regular k-means algorithm from the cluster package to proceed. Many thanks

Read the article

< Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >