Search Results

Search found 2515 results on 101 pages for 'distributed filesystems'.

Page 3/101 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Do your filesystems have un-owned files ?

    - by darrenm
    As part of our work for integrated compliance reporting in Solaris we plan to provide a check for determining if the system has "un-owned files", ie those which are owned by a uid that does not exist in our configured nameservice.  Tests such as this already exist in the Solaris CIS Benchmark (9.24 Find Un-owned Files and Directories) and other security benchmarks. The obvious method of doing this would be using find(1) with the -nouser flag.  However that requires we bring into memory the metadata for every single file and directory in every local file system we have mounted.  That is probaby not an acceptable thing to do on a production system that has a large amount of storage and it is potentially going to take a long time. Just as I went to bed last night an idea for a much faster way of listing file systems that have un-owned files came to me. I've now implemented it and I'm happy to report it works very well and peforms many orders of magnatude better than using find(1) ever will.   ZFS (since pool version 15) has per user space accounting and quotas.  We can report very quickly and without actually reading any files at all how much space any given user id is using on a ZFS filesystem.  Using that information we can implement a check to very quickly list which filesystems contain un-owned files. First a few caveats because the output data won't be exactly the same as what you get with find but it answers the same basic question.  This only works for ZFS and it will only tell you which filesystems have files owned by unknown users not the actual files.  If you really want to know what the files are (ie to give them an owner) you still have to run find(1).  However it has the huge advantage that it doesn't use find(1) so it won't be dragging the metadata for every single file and directory on the system into memory. It also has the advantage that it can check filesystems that are not mounted currently (which find(1) can't do). It ran in about 4 seconds on a system with 300 ZFS datasets from 2 pools totalling about 3.2T of allocated space, and that includes the uid lookups and output. #!/bin/sh for fs in $(zfs list -H -o name -t filesystem -r rpool) ; do unknowns="" for uid in $(zfs userspace -Hipn -o name,used $fs | cut -f1); do if [ -z "$(getent passwd $uid)" ]; then unknowns="$unknowns$uid " fi done if [ ! -z "$unknowns" ]; then mountpoint=$(zfs list -H -o mountpoint $fs) mounted=$(zfs list -H -o mounted $fs) echo "ZFS File system $fs mounted ($mounted) on $mountpoint \c" echo "has files owned by unknown user ids: $unknowns"; fi done Sample output: ZFS File system rpool/ROOT/solaris-30/var mounted (no) on /var has files owned by unknown user ids: 6435 33667 101 ZFS File system rpool/ROOT/solaris-32/var mounted (yes) on /var has files owned by unknown user ids: 6435 33667ZFS File system builds/bob mounted (yes) on /builds/bob has files owned by unknown user ids: 101 Note that the above might not actually appear exactly like that in any future Solaris product or feature, it is provided just as an example of what you can do with ZFS user space accounting to answer questions like the above.

    Read the article

  • For distributed applications, which to use, ASIO vs. MPI?

    - by Rhubarb
    I am a bit confused about this. If you're building a distributed application, which in some cases may perform parallel operations (although not necessarily mathematical), should you use ASIO or something like MPI? I take it MPI is a higher level than ASIO, but it's not clear where in the stack one would begin.

    Read the article

  • ideas for a distributed cache proxy server

    - by Neeraj
    Hi everyone! I am implementing, a distributed cache proxy server.I have an idea of the HTTP and related stuff, so i am rather concentrating on the sub part "Distributed data storage". From some search on web i found that this could be done using Distributed Hash Tables(DHT). I was wondering if there exists some kind of library for this preferably in C/C++. Any better suggestions for the same will also be appreciated.

    Read the article

  • Open source Distributed computing tool

    - by Prasenjit Chatterjee
    I want to set up distributed computing on my Local Area Network consisting a bunch of PCs. Say for the time being each one has the same OS - Windows 7. Is there any opensource tool available so that I can share the resources of these PCs over the LAN and increase the speed of my applications and the memory space. I know that if its a graphics intensive application then, it is not very practical, because the speed of LAN is much slower than Graphics processors. But I only want to share general applications, some basic softwares, Programming language IDEs etc. Can anyone shed some light on it? Thanks in Advance..

    Read the article

  • Is there a distributed project management software like Redmine?

    - by Tobias Kienzler
    I am quite familiar with and love using git, among other reasons due to its distributed nature. Now I'd like to set up some similarly distributed (FOSS) Project Management software with features similar to what Redmine offers, such as Issue & time tracking, milestones Gantt charts, calendar git integration, maybe some automatic linking of commits and issues Wiki (preferably with Mathjax support) Forum, news, notifications Multiple Projects However, I am looking for a solution that does not require a permanently accesible server, i.e. like in git, each user should have their own copy which can be easily synchronized with others. However it should be possible to not have a copy of every Project on every machine. Since trac uses multiple instances for multiple projects anyway, I was considering using that, but I neither know how well it adapts to simply giting the database itself (which would be be easiest way to handle the distribution due to git being used anyway), nor does it include all of Redmine's feature. So, can you recommend me a distributed project management software? If your suggestion is a software that usually runs on a server please include a description of the distribution method (e.g. whether simply putting the data in a git repository would do the trick), and if it's e.g. trac, please mention plugins required to include the features mentioned.

    Read the article

  • Hibernating and booting into another OS: will my filesystems be corrupted?

    - by Ryan Thompson
    Suppose I have Windows and Linux installed on the same computer. If I hibernate Windows, can I boot into Linux without corrupting the Windows filesystem when I resume Windows? What about the other way around? What if I hibernate one, boot into the other, and mount the hibernated filesystem read/write? Read-only? If this is unsafe, is there any way to detect the hibernated state of the other OS and prevent mounting its filesystem? Basically, how far can I push this before it breaks, and how dangerous is it near the edge? I think I know the answers to some of the above questions, but for other ones, I have no idea, and for obvious reasons I have not tested this on my own computer. If someone has tested these, please enlighten the rest of us. I'm not necessarily looking for a specific answer to every question; I'll accept any response that answers a reasonable portion. EDIT: Let me clarify that when I say "hibernate," I mean the process of writing the contents of RAM to the hard disk and completely powering down the computer. In this state, powering the computer back on brings you through the BIOS and bootloader again, and you could theoretically select another operating system on a multi-boot system. Anyway, on with the original question: RESULTS Ok, after everyone's assurances that this would work, I tested it for myself. I set up Ubuntu to remount all ntfs filesystems and external drives read-only before hibernating. There was no need for a similar Windows setup because Windows does not read Linux filesystems. Then, I tried alternately hibernating one operating system and resuming the other, back and forth a few times. I even tried mounting the Windows filesystem from Ubuntu read-write, and creating a few files. Windows didn't complain when I resumed. So, in conclusion, you can more or less freely hibernate in a dual-boot Windows/Linux scenario. Note that I did not test a dual Linux/Linux co-hibernation situation. If you have two or more Linux installs and you hibernate one of them, you might be able to corrupt the filesystem by mounting it from another.

    Read the article

  • Use a custom value object or a Guid as an entity identifier in a distributed system?

    - by Kazark
    tl;dr I've been told that in domain-driven design, an identifier for an entity could be a custom value object, i.e. something other than Guid, string, int, etc. Can this really be advisable in a distributed system? Long version I will invent an situation analogous to the one I am currently facing. Say I have a distributed system in which a central concept is an egg. The system allows you to order eggs and see spending reports and inventory-centric data such as quantity on hand, usage, valuation and what have you. There area variety of services backing these behaviors. And say there is also another app which allows you to compose recipes that link to a particular egg type. Now egg type is broken down by the species—ostrich, goose, duck, chicken, quail. This is fine and dandy because it means that users don't end up with ostrich eggs when they wanted quail eggs and whatnot. However, we've been getting complaints because jumbo chicken eggs are not even close to equivalent to small ones. The price is different, and they really aren't substitutable in recipes. And here we thought we were doing users a favor by not overwhelming them with too many options. Currently each of the services (say, OrderSubmitter, EggTypeDefiner, SpendingReportsGenerator, InventoryTracker, RecipeCreator, RecipeTracker, or whatever) are identifying egg types with an industry-standard integer representation the species (let's call it speciesCode). We realize we've goofed up because this change could effect every service. There are two basic proposed solutions: Use a predefined identifier type like Guid as the eggTypeID throughout all the services, but make EggTypeDefiner the only service that knows that this maps to a speciesCode and eggSizeCode (and potentially to an isOrganic flag in the future, or whatever). Use an EggTypeID value object which is a combination of speciesCode and eggSizeCode in every service. I've proposed the first solution because I'm hoping it better encapsulates the definition of what an egg type is in the EggTypeDefiner and will be more resilient to changes, say if some people now want to differentiate eggs by whether or not they are "organic". The second solution is being suggested by some people who understand DDD better than I do in the hopes that less enrichment and lookup will be necessary that way, with the justification that in DDD using a value object as an ID is fine. Also, they are saying that EggTypeDefiner is not a domain and EggType is not an entity and as such should not have a Guid for an ID. However, I'm not sure the second solution is viable. This "value object" is going to have to be serialized into JSON and URLs for GET requests and used with a variety of technologies (C#, JavaScript...) which breaks encapsulation and thus removes any behavior of the identifier value object (is either of the fields optional? etc.) Is this a case where we want to avoid something that would normally be fine in DDD because we are trying to do DDD in a distributed fashion? Summary Can it be a good idea to use a custom value object as an identifier in a distributed system (solution #2)?

    Read the article

  • Using Openfire for distributed XMPP-based video-chat

    - by Yitzhak
    I have been tasked with setting up a distributed video-chat system built on XMPP. Currently my setup looks like this: Openfire (XMPP server) + JingleNodes plugin for video chat OpenLDAP (LDAP server) for storing user information and allowing directory queries Kerberos server for authentication and passwords In testing with one set of machines (i.e. only three), everything works as expected: I can log in to Openfire and it looks up the user information in the OpenLDAP database, which in turn authenticates my user with Kerberos. Now, I want to have several clusters, so that there is a cluster on each continent. A typical cluster will probably contain 2-5 servers. Users logging in will be directed to the closest cluster based on geographical location. Something that concerns me particularly is the dynamic maintenance of contact lists. If a user is using a machine in Asia, for example, how would contact lists be updated around the world to reflect the current server he is using? How would that work with LDAP? Specific questions: How do I direct users based on geographical location? What is the best architecture for a cluster? -- would all traffic need to come into a load-balancer on each one, for example? How do I manage the update of contact lists across all these servers? In general, how do I go about setting this up? What are the pitfalls in doing this? I am inexperienced in this area, so any advice and suggestions would be appreciated.

    Read the article

  • Distributed and/or Parallel SSIS processing

    - by Jeff
    Background: Our company hosts SaaS DSS applications, where clients provide us data Daily and/or Weekly, which we process & merge into their existing database. During business hours, load in the servers are pretty minimal as it's mostly users running simple pre-defined queries via the website, or running drill-through reports that mostly hit the SSAS OLAP cube. I manage the IT Operations Team, and so far this has presented an interesting "scaling" issue for us. For our daily-refreshed clients, the server is only "busy" for about 4-6 hrs at night. For our weekly-refresh clients, the server is only "busy" for maybe 8-10 hrs per week! We've done our best to use some simple methods of distributing the load by spreading the daily clients evenly among the servers such that we're not trying to process daily clients back-to-back over night. But long-term this scaling strategy creates two notable issues. First, it's going to consume a pretty immense amount of hardware that sits idle for large periods of time. Second, it takes significant Production Support over-head to basically "schedule" the ETL such that they don't over-lap, and move clients/schedules around if they out-grow the resources on a particular server or allocated time-slot. As the title would imply, one option we've tried is running multiple SSIS packages in parallel, but in most cases this has yielded VERY inconsistent results. The most common failures are DTExec, SQL, and SSAS fighting for physical memory and throwing out-of-memory errors, and ETLs running 3,4,5x longer than expected. So from my practical experience thus far, it seems like running multiple ETL packages on the same hardware isn't a good idea, but I can't be the first person that doesn't want to scale multiple ETLs around manual scheduling, and sequential processing. One option we've considered is virtualizing the servers, which obviously doesn't give you any additional resources, but moves the resource contention onto the hypervisor, which (from my experience) seems to manage simultaneous CPU/RAM/Disk I/O a little more gracefully than letting DTExec, SQL, and SSAS battle it out within Windows. Question to the forum: So my question to the forum is, are we missing something obvious here? Are there tools out there that can help manage running multiple SSIS packages on the same hardware? Would it be more "efficient" in terms of parallel execution if instead of running DTExec, SQL, and SSAS same machine (with every machine running that configuration), we run in pairs of three machines with SSIS running on one machine, SQL on another, and SSAS on a third? Obviously that would only make sense if we could process more than the three ETL we were able to process on the machine independently. Another option we've considered is completely re-architecting our SSIS package to have one "master" package for all clients that attempts to intelligently chose a server based off how "busy" it already is in terms of CPU/Memory/Disk utilization, but that would be a herculean effort, and seems like we're trying to reinvent something that you would think someone would sell (although I haven't had any luck finding it). So in summary, are we missing an obvious solution for this, and does anyone know if any tools (for free or for purchase, doesn't matter) that facilitate running multiple SSIS ETL packages in parallel and on multiple servers? (What I would call a "queue & node based" system, but that's not an official term). Ultimately VMWare's Distributed Resource Scheduler addresses this as you simply run a consistent number of clients per VM that you know will never conflict scheduleing-wise, then leave it up to VMWare to move the VMs around to balance out hardware usage. I'm definitely not against using VMWare to do this, but since we're a 100% Microsoft app stack, it seems like -someone- out there would have solved this problem at the application layer instead of the hypervisor layer by checking on resource utilization at the OS, SQL, SSAS levels. I'm open to ANY discussion on this, and remember no suggestion is too crazy or radical! :-) Right now, VMWare is the only option we've found to get away from "manually" balancing our resources, so any suggestions that leave us on a pure Microsoft stack would be great. Thanks guys, Jeff

    Read the article

  • Running Awk command on a cluster

    - by alex
    How do you execute a Unix shell command (awk script, a pipe etc) on a cluster in parallel (step 1) and collect the results back to a central node (step 2) Hadoop seems to be a huge overkill with its 600k LOC and its performance is terrible (takes minutes just to initialize the job) i don't need shared memory, or - something like MPI/openMP as i dont need to synchronize or share anything, don't need a distributed VM or anything as complex Google's SawZall seems to work only with Google proprietary MapReduce API some distributed shell packages i found failed to compile, but there must be a simple way to run a data-centric batch job on a cluster, something as close as possible to native OS, may be using unix RPC calls i liked rsync simplicity but it seem to update remote notes sequentially, and you cant use it for executing scripts as afar as i know switching to Plan 9 or some other network oriented OS looks like another overkill i'm looking for a simple, distributed way to run awk scripts or similar - as close as possible to data with a minimal initialization overhead, in a nothing-shared, nothing-synchronized fashion Thanks Alex

    Read the article

  • Alternative to distributed caching

    - by Chen
    Hi, There is a technical requirement to scale a new system easily. This new system consists of three tiered applications (as a batch processors). Each tier will contains at least 2 servers with the same application resides on each server. So, when one of the tier reaches peak performance, we could extend the scalability easily by adding a new server and the same application to off-load some of the processing loads. The problem is that one or two of the three tiers require heavy caching (about 3 million records and increasing). I'm thinking of using distributed caching system to overcome this problem but the new distributed caching system will means an additional point of failure as applications now need to interact with additional caching systems for processing. I'm currently looking at ncache but just wondering if there is an alternatives to this problem? or is there any other comparable distributed caching system that maybe similar or better than ncache and provide enterprise supports too? Thanks, Chen

    Read the article

  • Is there a secure p2p distributed database?

    - by p2pgirl
    I'm looking for a distributed hash table to store and retrieve values securely. These are my requirements: It must use an existing popular p2p network (I must guarantee my key/value will be stored and kept in multiple peers). None but myself should be able to edit or delete the key/value. Ideally an encryption key that only I have access to would be required to edit my key value. All peers would be able to read the key value (read-only access, only the key holder would be able to edit the value) Is there such p2p distributed hash table? Would the bittorrent distributed hash table meet my requirements?' Where could I find documentation?

    Read the article

  • How are distributed services better than distributed objects?

    - by Gabriel Šcerbák
    I am not interested in the technology e.g. CORBA vs Web Services, I am interested in principles. When we are doing OOP, why should we have something so procedural at higher level? Is not it the same as with OOP and relational databases? Often services are supported through code generation, apart from boilerplate, I think it is because we new SOM - service object mapper. So again, what are the reasons for wervices rather than objects?

    Read the article

  • Distributed, Parallel, Fault-tolerant File System

    - by Eddified
    There are so many choices that it's hard to know where to start. My requirements are these: Runs on Linux Most of the files will be between 5-9 MB in size. There will also be a significant number of small-ish jpgs (100px x 100px). All of the files need to be available over http. Redundancy -- ideally it would provide the space efficiency similar to RAID 5 of 75% (in RAID 5 this would be calculated thus: with 4 identical disks, 25% of the space is used for parity = 75% efficent) Must support several petabytes of data scalable runs on commodity hardware In addition, I look for these qualities, though they are not "requirements": Stable, mature file system Lots of momentum and support etc I would like some input as to which file system works best for the given requirements. Some people at my organization are leaning towards MogileFS, but I'm not convinced of the stability and momentum of that project. GlusterFS and Lustre, based on my limited research, appear to be better supported... Thoughts?

    Read the article

  • Web application design with distributed servers

    - by Bonn
    I want to build a web application/server with this structure: main-server sub-server transaction-server (create, update, delete) view-server (view, search) authentication-server documents-server reporting-server library-server e-learning-server The main-server acts as host server for sub-server. I can add many sub-servers and connect it to main-server (via plug-play interface maybe), then it can begin querying data from another sub-servers (which has been connected to the main-server). The sub-servers can be anywhere as long as connected to internet. The main-server can manage all sub-servers which are connected to it (query data, setting permission between sub-servers, etc). The purpose is simple, the web application will be huge as the company grows, so I want to distribute it into small connected plug-able servers. My question is, does the structure above already have a standardized method? or are there any different views? what are the technologies needed? I need a lot of researches before the execution plan begin. thanks a lot.

    Read the article

  • Can Foswiki be used as a distributed Redmine replacement? [closed]

    - by Tobias Kienzler
    I am quite familiar with and love using git, among other reasons due to its distributed nature. Now I'd like to set up some similarly distributed (FOSS) Project Management software with features similar to what Redmine offers, such as Issue & time tracking, milestones Gantt charts, calendar git integration, maybe some automatic linking of commits and issues Wiki (preferably with Mathjax support) Forum, news, notifications Multiple Projects However, I am looking for a solution that does not require a permanently accesible server, i.e. like in git, each user should have their own copy which can be easily synchronized with others. However it should be possible to not have a copy of every Project on every machine. Since trac uses multiple instances for multiple projects anyway, I was considering using that, but I neither know how well it adapts to simply giting the database itself (which would be be easiest way to handle the distribution due to git being used anyway), nor does it include all of Redmine's feature. After checking http://www.wikimatrix.org for Wikis with integrated tracking system and RCS support, and filtering out seemingly stale project, the choices basically boil down to Foswiki, TWiki and Ikiwiki. The latter doesn't seem to offer as many usability features, and in the TWiki vs Foswiki issue I tend to the latter. Finally, there is Fossil, which starts from the other end by attempting to replace git entirely and tracking itself. I am however not too comfortable with the thought of replacing git, and Fossil's non-SCM features don't seem to be as developed. Now before I invest too much time when someone else might already have tried this, I basically have two questions: Are there crucial features of Project Management software like Redmine that Foswiki does not provide even with all the extensions available? How to set Foswiki up to use git instead of the perl RcsLite?

    Read the article

  • topics in distributed systems

    - by scatman
    what do you think is an interesting topic in distributed systems. i should pic a topic and present it on monday. at first i chose to talk about Wuala, but after reading about it, i don't think its that interesting. so what is an interesting (new) topic in distributed systems that i can research about. sorry if this is the wrong place to post.

    Read the article

  • DRY Authenticated Tasks in Cocoa (with distributed objects)

    - by arbales
    I'm kind of surprise/infuriated that the only way for me to run an authenticated task, like perhaps sudo gem install shi*t, is to make a tool with pre-written code. I'm writing a MacRuby application, which doesn't seem to expose the KAuthorization* constants/methods. So.. I learned Cocoa and Objective-C. My application creates a object, serves it and calls the a tool that elevates itself and then performs a selector on a distributed object (in the tool's thread). I hoped that the distributed object's methods would evaluated inside the tool, so I could use delegation to create "privileged" tasks. If this won't work, don't try to save it, I just want a DRY/cocoa solution. AuthHelper.m //AuthorizationExecuteWithPrivileges of this. AuthResponder* my_responder = [AuthResponder sharedResponder]; // Gets the proxy object (and it's delegate) NSString *selector = [NSString stringWithUTF8String:argv[3]]; NSLog(@"Performing selector: %@", selector); setuid(0); if ([[my_responder delegate] respondsToSelector:NSSelectorFromString(selector)]){ [[my_responder delegate] performSelectorOnMainThread:NSSelectorFromString(selector) withObject:nil waitUntilDone:YES]; } RandomController.m - (void)awakeFromNib { helperToolPath = [[[NSBundle mainBundle] resourcePath] stringByAppendingString:@"/AuthHelper"]; delegatePath = [[[NSBundle mainBundle] resourcePath] stringByAppendingString:@"/ABExtensions.rb"]; AuthResponder* my_responder = [AuthResponder initAsService]; [my_responder setDelegate:self]; } -(oneway void)install_gems{ NSArray *args = [NSArray arrayWithObjects: @"gem", @"install", @"sinatra", nil]; [NSTask launchedTaskWithLaunchPath:@"/usr/bin/sudo" arguments:args]; NSLog(@"Ran AuthResponder.delegate.install_gems"); // This prints. } ... other privileges tasks. "sudo gem update --system" for one. I'm guessing the proxy object is performing the selector in it's own thread, but I want the current (privileged thread) to do it so I can use sudo. Can I force the distributed object to evaluate the selector on the tool's thread? How else can I accomplish this dryly/cocoaly?

    Read the article

  • Auto-mount filesystems on boot fails (12.10)

    - by Joshua Pruitt
    I have a Compaq HP 8200 slim desktop running 12.10 with encrypted partitions (set up with the text-based installer). Everything's working fine, except... When I boot the computer, my /boot and /boot/efi directories refuse to mount automatically. I'm dropped to the root console, where I must enter 'mountall -v', and everything then continues on just fine. This was happening under 12.04. I've recently upgraded to 12.10, and the problem persists. Except now, in addition to /boot and /boot/efi not mounting, roughly 50% of the time /var will not be auto-mounted as well (and again, 'mountall -v' fixes allows me to boot and move on). I'm puzzled about this one. Running 'fsck' doesn't seem to do anything (the filesystems aren't damaged anyway). What can I try to solve this issue? Here's my /etc/fstab: http://paste.ubuntu.com/1338508/ Thanks in advance!!! Addendum: I have tried changing the entries in fstab from UUIDs to the actual devices, to no avail.

    Read the article

  • What is lightweight lock in distributed shared memory systems?

    - by Kutluhan Metin
    I started reading Tanenbaum's Distributed Systems book a while ago. I read about two phase locking and timestamp reordering in transactions chapter. While having a deeper look from google I heard of lightweight transactions/lightweight transactional memory. But I couldn't find any good explanation and implementation. So what is lightweight memory? What are the benefits of lightweight locks? And how can I implement them?

    Read the article

  • Where do you earn more money (Autonomous Systems vs Distributed Systems)? [closed]

    - by Puckl
    I am interested in both topics and I can choose between them for my computer science master. I think the distributed systems master focuses more on software technologies and the autononmous systems master is focused on robotics and machine learning. Do you get good jobs in the fild of machine learning without a Ph.D.? I guess there are more jobs available in the Software-Tech world, is this right? Where do you earn more money? (It is not the only criteria, but it matters)

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >