Search Results

Search found 43200 results on 1728 pages for 'large object pattern'.

Page 154/1728 | < Previous Page | 150 151 152 153 154 155 156 157 158 159 160 161 | Next Page >

Getting started with massive data

- by Max

I'm a math guy and occasionally do some statistics/machine learning analysis consulting projects on the side. The data I have access to are usually on the smaller side, at most a couple hundred of megabytes (and almost always far less), but I want to learn more about handling and analyzing data on the gigabyte/terabyte scale. What do I need to know and what are some good resources to learn from? Hadoop/MapReduce is one obvious start. Is there a particular programming language I should pick up? (I primarily work now in Python, Ruby, R, and occasionally Java, but it seems like C and Clojure are often used for large-scale data analysis?) I'm not really familiar with the whole NoSQL movement, except that it's associated with big data. What's a good place to learn about it, and is there a particular implementation (Cassandra, CouchDB, etc.) I should get familiar with? Where can I learn about applying machine learning algorithms to huge amounts of data? My math background is mostly on the theory side, definitely not on the numerical or approximation side, and I'm guessing most of the standard ML algorithms don't really scale. Any other suggestions on things to learn would be great!

Read the article
XSLT fails to load huge XML doc after matching certain elements

- by krisvandenbergh

I'm trying to match certain elements using XSLT. My input document is very large and the source XML fails to load after processing the following code (consider especially the first line). <xsl:template match="XMI/XMI.content/Model_Management.Model/Foundation.Core.Namespace.ownedElement/Model_Management.Package/Foundation.Core.Namespace.ownedElement"> <rdf:RDF> <rdf:Description rdf:about=""> <xsl:for-each select="Foundation.Core.Class"> <xsl:for-each select="Foundation.Core.ModelElement.name"> <owl:Class rdf:ID="@Foundation.Core.ModelElement.name" /> </xsl:for-each> </xsl:for-each> </rdf:Description> </rdf:RDF> </xsl:template> Apparently the XSLT fails to load after "Model_Management.Model". The PHP code is as follows: if ($xml->loadXML($source_xml) == false) { die('Failed to load source XML: ' . $http_file); } It then fails to perform loadXML and immediately dies. I think there are two options now. 1) I should set a maximum executing time. Frankly, I don't know how that I do this for the built-in PHP 5 XSLT processor. 2) Think about another way to match. What would be the best way to deal with this? The input document can be found at http://krisvandenbergh.be/uml_pricing.xml Any help would be appreciated! Thanks.

Read the article
Copying a foreign Subversion repository to keep under dependencies

- by Jonathan Sternberg

I want to keep dependencies for my project in our own repository, that way we have consistent libraries for the entire team to work with. For example, I want our project to use the Boost libraries. I've seen this done in the past with putting dependencies under a "vendor" or "dependencies" folder. But I still want to be able to update these dependencies. If a new feature appears in a library and we need it, I want to just be able to update that repository within our own repository. I don't want to have to recopy it and put it under version control again. I'd also like for us to have the ability to change dependencies if a small change is needed without stopping us from ever updating the library. I want the ability to do something like 'svn cp', then be able to 'svn merge' in the future. I just tried this with the boost trunk, but I'm not able to get any history using 'svn log' on the copy I made. How do I do this? What is usually done for large projects with dependencies?

Read the article
Database structure for ecommerce site

- by imanc

Hey Guys, I have been tasked with designing an ecommerce solution. The aspect that is causing me the most problems is the database. Currently the site consists of 10+ country based shops each with their own database (all residing on the same mysql instance). For the new site I'd rather all these shop databases be merged into one database so that all tables (products, orders, customers etc.) have a shop_id field. From a programming perspective this seems to make the most sense as we won't have to manage data across multiple databases. Currently the entire site generates about 120k orders a year, but is experiencing fairly heavy growth and we need to design a solution that will scale. In 5 years there may be more than a million orders per year and a database that contains 5 years order history (archiving maybe a solution here). The question is - do we use a single database, or do we keep the database-per-shop structure? I am currently trying to find supporting evidence for either avenue. The company I am designing the solution for prefer the per-shop database structure because they believe it will allow the sites to scale. But my argument is that the shop's database probably won't get that busy over the next few years that they exceed the capacity of a mysql database and a "no expenses spared" hardware set-up. I am wondering if anyone has any advice either way? Does anyone have experience with websites / ecommerce sites that have tables containing millions of records? I know there is probably not a clear answer here, but at what stage do we have too many records or too large table files to have a fast loading site? Also, if anyone has any advice on sources of information - books, websites, etc. where I can do further research, it would be highly appreciated! Cheers, imanc

Read the article
Read/Write/Find/Replace huge csv file

- by notapipe

I have a huge (4,5 GB) csv file.. I need to perform basic cut and paste, replace operations for some columns.. the data is pretty well organized.. the only problem is I cannot play with it with Excel because of the size (2000 rows, 550000 columns). here is some part of the data: ID,Affection,Sex,DRB1_1,DRB1_2,SENum,SEStatus,AntiCCP,RFUW,rs3094315,rs12562034,rs3934834,rs9442372,rs3737728 D0024949,0,F,0101,0401,SS,yes,?,?,A_A,A_A,G_G,G_G D0024302,0,F,0101,7,SN,yes,?,?,A_A,G_G,A_G,?_? D0023151,0,F,0101,11,SN,yes,?,?,A_A,G_G,G_G,G_G I need to remove 4th, 5th, 6th, 7th, 8th and 9th columns; I need to find every _ character from column 10 onwards and replace it with a space ( ) character; I need to replace every ? with zero (0); I need to replace every comma with a tab; I need to remove first row (that has column names; I need to replace every 0 with 1, every 1 with 2 and every ? with 0 in 2nd column; I need to replace F with 2, M with 1 and ? with 0 in 3rd column; so that in the resulting file the output reads: D0024949 1 2 A A A A G G G G D0024302 1 2 A A G G A G 0 0 D0023151 1 2 A A G G G G G G (both input and output should read one line per row, ne extra blank row) Is there a memory efficient way of doing that with java(and I need a code to do that) or a usable tool for playing with this large data so that I can easily apply Excel functionality..

Read the article
Improving File Read Performance (single file, C++, Windows)

- by david

I have large (hundreds of MB or more) files that I need to read blocks from using C++ on Windows. Currently the relevant functions are: errorType LargeFile::read( void* data_out, __int64 start_position, __int64 size_bytes ) const { if( !m_open ) { // return error } else { seekPosition( start_position ); DWORD bytes_read; BOOL result = ReadFile( m_file, data_out, DWORD( size_bytes ), &bytes_read, NULL ); if( size_bytes != bytes_read || result != TRUE ) { // return error } } // return no error } void LargeFile::seekPosition( __int64 position ) const { LARGE_INTEGER target; target.QuadPart = LONGLONG( position ); SetFilePointerEx( m_file, target, NULL, FILE_BEGIN ); } The performance of the above does not seem to be very good. Reads are on 4K blocks of the file. Some reads are coherent, most are not. A couple questions: Is there a good way to profile the reads? What things might improve the performance? For example, would sector-aligning the data be useful? I'm relatively new to file i/o optimization, so suggestions or pointers to articles/tutorials would be helpful.

Read the article
MySQL - What is wrong with this query or my database? Terrible performance.

- by Moss

SELECT * from `employees` a LEFT JOIN (SELECT phone1 p1, count(*) c, FROM `employees` GROUP BY phone1) b ON a.phone1 = b.p1; I'm not sure if it is this query in particular that has the problem. I have been getting terrible performance in general with this database. The table in question has 120,000 rows. I have tried this particular query remotely and locally with the MyISAM and InnoDB engines, with different types of joins, and with and without an index on phone1. I can get this to complete in about 4 minutes on a 10,000 row table successfully but performance drops exponentially with larger tables. Remotely it will lose connection to the server and locally it brings my system to its knees and seems to go on forever. This query is only a smaller step I was trying to do when a larger query couldn't complete. Maybe I should explain the whole scenario. I have one big flat ugly table that lists a bunch of people and their contact info and the info of the companies they work for. I'm trying to normalize the database and intelligently determine which phone numbers apply to individual people and which apply to an office location. My reasoning is that if a phone number occurs multiple times and the number of occurrence equals the number of times that the street address it is attached to occurs then it must be an office number. So the first step is to count each phone number grouping by phone number. Normally if you just use COUNT()...GROUP BY it will only list the first record it finds in that group so I figured I have to join the full table to the count table where the phone number matches. This does work but as I said I can't successfully complete it on any table much larger than 10,000 rows. This seems pathetic and this doesn't seem like a crazy query to do. Is there a better way to achieve what I want or do I have to break my large table into 12 pieces or is there something wrong with the table or db?

Read the article
How to do role-based access control for a franchise business?

- by FreshCode

I'm building the 2nd iteration of a web-based CRM+CMS for a franchise service business in ASP.NET MVC 2. I need to control access to each franchise's services based on the roles a user is assigned for that franchise. 4 examples: Receptionist should be able to book service jobs in for her "Atlantic Seaboard" franchise, but not do any reporting. Technician should be able to alter service jobs, but not modify invoices. Managers should be able to apply discount to invoices for jobs within their stores. Owner should be able to pull reports for any franchises he owns. Where should franchise-level access control fit in between the Data - Services - Web layer? If it belongs in my Controllers, how should I best implement it? Partial Schema Roles class int ID { get; set; } // primary key for Role string Name { get; set; } Partial Franchises class short ID { get; set; } // primary key for Franchise string Slug { get; set; } // unique key for URL access, eg /{franchise}/{job} string Name { get; set; } UserRoles mapping short FranchiseID; // related to franchises table Guid UserID; // related to Users table int RoleID; // related to Roles table DateTime ValidFrom; DateTime ValidUntil; Background I built the previous CRM in classic ASP and it runs the business well, but it's time for an upgrade to speed up workflow and leave less room for error. For the sake of proper testing and better separation between data and presentation, I decided to implement the repository pattern as seen in Rob Conery's MVC Storefront series. Controller Implementation Access Control with [Authorize] attribute If there was just one franchise involved, I could simply limit access to a controller action like so: [Authorize(Roles="Receptionist, Technician, Manager, Owner")] public ActionResult CreateJob(Job job) { ... } And since franchises don't just pop up over night, perhaps this is a strong case to use the new Areas feature in ASP.NET MVC 2? Or would this lead to duplicate Views? Controllers, URL Routing & Areas Assuming Areas aren't used, what would be the best way to determine which franchise's data is being accessed? I thought of this: {franchise}/{controller}/{action}/{id} or is it better to determine a job's franchise in a Details(...) action and limit a user's action with [Authorize]: {job}/{id}/{action}/{subaction} {invoice}/{id}/{action}/{subaction} which makes more sense if any user could potentially have access to more than one franchise without cluttering the URL with a {franchise} parameter. Any input is appreciated.

Read the article
Visitor and templated virtual methods

- by Thomas Matthews

In a typical implementation of the Visitor pattern, the class must account for all variations (descendants) of the base class. There are many instances where the same method content in the visitor is applied to the different methods. A templated virtual method would be ideal in this case, but for now, this is not allowed. So, can templated methods be used to resolve virtual methods of the parent class? Given (the foundation): struct Visitor_Base; // Forward declaration. struct Base { virtual accept_visitor(Visitor_Base& visitor) = 0; }; // More forward declarations struct Base_Int; struct Base_Long; struct Base_Short; struct Base_UInt; struct Base_ULong; struct Base_UShort; struct Visitor_Base { virtual void operator()(Base_Int& b) = 0; virtual void operator()(Base_Long& b) = 0; virtual void operator()(Base_Short& b) = 0; virtual void operator()(Base_UInt& b) = 0; virtual void operator()(Base_ULong& b) = 0; virtual void operator()(Base_UShort& b) = 0; }; struct Base_Int : public Base { void accept_visitor(Visitor_Base& visitor) { visitor(*this); } }; struct Base_Long : public Base { void accept_visitor(Visitor_Base& visitor) { visitor(*this); } }; struct Base_Short : public Base { void accept_visitor(Visitor_Base& visitor) { visitor(*this); } }; struct Base_UInt : public Base { void accept_visitor(Visitor_Base& visitor) { visitor(*this); } }; struct Base_ULong : public Base { void accept_visitor(Visitor_Base& visitor) { visitor(*this); } }; struct Base_UShort : public Base { void accept_visitor(Visitor_Base& visitor) { visitor(*this); } }; Now that the foundation is laid, here is where the kicker comes in (templated methods): struct Visitor_Cout : public Visitor { template <class Receiver> void operator() (Receiver& r) { std::cout << "Visitor_Cout method not implemented.\n"; } }; Intentionally, Visitor_Cout does not contain the keyword virtual in the method declaration. All the other attributes of the method signatures match the parent declaration (or perhaps specification). In the big picture, this design allows developers to implement common visitation functionality that differs only by the type of the target object (the object receiving the visit). The implementation above is my suggestion for alerts when the derived visitor implementation hasn't implement an optional method. Is this legal by the C++ specification? (I don't trust when some says that it works with compiler XXX. This is a question against the general language.)

Read the article
Conditional Statement as Hierarchy jQuery

- by Jacob Lowe

Is there a way to make jQuery use objects in a conditional statement as an object in a hierarchy. For Example, I want to validate that something exist then tell it to do something just using the this selector. Like this if ($(".tnImg").length) { //i have to declare what object I am targeting here to get this to work $(this).animate({ opacity: 0.5, }, 200 ); } Is there a way to get jQuery to do this? I guess theres not a huge benefit but i still am curious

Read the article
Socket server with multiple clients, sending messages to many clients without hurting liveliness

- by Karl Johanson

I have a small socket server, and I need to distribute various messages from client-to-client depending on different conditionals. However I think I have a small problem with livelyness in my current code, and is there anything wrong in my approach: public class CuClient extends Thread { Socket socket = null; ObjectOutputStream out; ObjectInputStream in; CuGroup group; public CuClient(Socket s, CuGroup g) { this.socket = s; this.group = g; out = new ObjectOutputStream(this.socket.getOutputStream()); out.flush(); in = new ObjectInputStream(this.socket.getInputStream()); } @Override public void run() { String cmd = ""; try { while (!cmd.equals("client shutdown")) { cmd = (String) in.readObject(); this.group.broadcastToGroup(this, cmd); } out.close(); in.close(); socket.close(); } catch (Exception e) { System.out.println(this.getName()); e.printStackTrace(); } } public void sendToClient(String msg) { try { this.out.writeObject(msg); this.out.flush(); } catch (IOException ex) { } } And my CuGroup: public class CuGroup { private Vector<CuClient> clients = new Vector<CuClient>(); public void addClient(CuClient c) { this.clients.add(c); } void broadcastToGroup(CuClient clientName, String cmd) { Iterator it = this.clients.iterator(); while (it.hasNext()) { CuClient cu = (CuClient)it.next(); cu.sendToClient(cmd); } } } And my main-class: public class SmallServer { public static final Vector<CuClient> clients = new Vector<CuClient>(10); public static boolean serverRunning = true; private ServerSocket serverSocket; private CuGroup group = new CuGroup(); public void body() { try { this.serverSocket = new ServerSocket(1337, 20); System.out.println("Waiting for clients\n"); do { Socket s = this.serverSocket.accept(); CuClient t = new CuClient(s,group); System.out.println("SERVER: " + s.getInetAddress() + " is connected!\n"); t.start(); } while (this.serverRunning); } catch (IOException ex) { ex.printStackTrace(); } } public static void main(String[] args) { System.out.println("Server"); SmallServer server = new SmallServer(); server.body(); } } Consider the example with many more groups, maybe a Collection of groups. If they all synchronize on a single Object, I don't think my server will be very fast. I there a pattern or something that can help my liveliness?

Read the article
Listing available COM Objects with Powershell

- by outtacontrol

I am currently using the following script to list the available COM Objects on my machine. $path = "REGISTRY::HKEY_CLASSES_ROOT\CLSID\*\PROGID" foreach ($obj in dir $path) { write-host $obj.GetValue("") } I read on another website that the existence of the InProcServer32 key is evidence that the object is 64 bit compatible. So using powershell how can I determine the existence of InProcServer32 for each COM Object? If that is even the correct way of establishing whether it is 32 bit or 64 bit.

Read the article
ASP.net Create Self re-caching objects?

- by BlackTea

How can i make a cached object re-cache it self with updated info when the cache has expired? I'm trying to prevent the next user who request the cache to have to deal with getting the data setting the cache then using it is there any background method/event i can tie the object to so that when it expires it just calls the method it self and self-caches.

Read the article
Javascript objects: get parent

- by Dänu

Hey Guys I got some question about nested javascript objects (is it me, google or is javascript quite poorly documented?). Now, I got the following (nested) object: obj: { subObj: { 'hello world' } }; next thing I do is to reference the subobject like this: var s = obj.subObj; now what I would like to do, is to get a reference to the object obj out of the variable s. Something like: var o = s.parent; Is this somehow possible?

Read the article
Where about should my main class be created in a project?

- by Dan

The problem is where a class should be created in my code. An example is I have a UI class and a main logic class that controls other objects. Should the main logic class create the UI object, or should the UI object create the instance of the main logic class? An explanation of which method is best and why would be ideal. Thanks.

Read the article
Does Javascript have classes?

- by Glycerine

A friend and I had an argument last week. He stated there were no such things as classes in Javascript. I said there was as you can say var object = new Object() he says "as there is no word class used. Its not a class. -- Whats your take on it guys? thanks.

Read the article
Objects With No Behavior

- by Patrick Donovan

I've been teaching myself object oriented programming and I'm thinking about a situation where I have an object "Transaction", that has quite a few properties to it like account, amount, date, currency, type, etc. I never plan to mutate these data points, and calculation logic will live in other classes. My question is, is it poor Python design to instantiate thousands of objects just to hold data? I find the data far easier to work with embedded in a class rather than trying to cram it into some combination of data structures.

Read the article
Objective-C++ Memory Problem

- by Stephen Furlani

Hello, I'm having memory woes. I've got a C++ Library (Equalizer from Eyescale) and they use the Traversal Visitor Pattern to allow you to add new functionality to their classes. I've finally figured out how it works, and I've got a Visitor that just returns the properties from one of the objects. (since I don't know how they're allocated). so. My little code does this: VisitorResult AGLContextVisitor::visit( Channel* channel ) { // Search through Nodes, Pipes until we get to the right window. // Add some code to make sure we find the right one? // Not executing the following code as C++ in gdb? eq::Window* w = channel->getWindow(); OSWindow* osw = w->getOSWindow(); AGLWindow* aw = (AGLWindow *)osw; AGLContext agl_ctx = aw->getAGLContext(); this->setContext(agl_ctx); return TRAVERSE_PRUNE; } So here's the problem. eq::Window* w = channel->getWindow(); (gdb) print w 0x0 BUT If I do this: (gdb) set objc-non-blocking-mode off (gdb) print w=channel->getWindow() 0x300effb9 // an honest memory location, and sets w as verified in the Debugger window of XCode. It does the same thing for osw. I don't get it. Why would something work in (gdb) but not in the code? The file is completely a cpp file, but it seems to be running in objc++, since I need to turn blocking off. Help!? I feel like I'm missing some memory-management basic thing here, either with C++ or Obj-C. [edit] channel-getWindow() is supposed to do this: /** @return the parent window. @version 1.0 */ Window* getWindow() { return _window; } The code also executes fine if I run it from a C++-only application. [edit] No... I tried creating a simple stand-alone program since I was tired of running it as a plugin. Messy to debug. And no, it doesn't run in the C++ program either. So I'm really at a loss as to what I'm doing wrong. Thanks, -- Stephen Furlani

Read the article
Versant OQL Statement with an Arithmetic operator

- by Pascal

I'm working on a c# project that use a Versant Object Database back end and I'm trying to build a query that contains an arithmetic operator. The documentation states that it is supported but lack any example. I'm trying to build something like this: SELECT * FROM _orderItemObject WHERE _qtyOrdered - _qtySent > 0 If I try this statement in the Object Inspector I get a synthax error near the '-'. Anyone has an example of a working VQL with that kind of statement? Thanks

Read the article
Javascript (JSON) Objects: Get Parent

- by Dänu

Hey Guys I got some question about nested javascript objects (is it me, google or is javascript quite poorly documented?). Now, I got the following (nested) object: obj: { subObj: { 'hello world' } }; next thing I do is to reference the subobject like this: var s = obj.subObj; now what I would like to do, is to get a reference to the object obj out of the variable s. Something like: var o = s.parent; Is this somehow possible?

Read the article
java objects, shared variables

- by raven

hello, I have a simple question here. If I declare a variable inside an object which was made [declared] in the main class, like this: public static int number; ( usually I declare it like this : private int number; ) can it be used in a different object which was also made [declared] in the main class? btw I do not care about security atm, I just want to make something work, don't care about protection)

Read the article
Do Java or C++ lack any OO features?

- by tsv

I am interested in understanding object-oriented programming in a more academic and abstract way than I currently do, and want to know if there are any object-oriented concepts Java and C++ fail to implement. I realise neither of the languages are "pure" OO, but I am interested in what (if anything) they lack, not what they have extra.

Read the article
How do I define my own operators in the Io programming language?

- by klep

I'm trying to define my own operator in Io, and I'm having a hard time. I have an object: MyObject := Object clone do( lst := list() !! := method(n, lst at(n)) ) But when I call it, like this: x := MyObject clone do(lst appendSeq(list(1, 2, 3))) x !! 2 But I get an exception that argument 0 to at must not be nil. How can I fix?

Read the article
Access specifier's and classes and objects?

- by TimothyTech

Alright, im trying to understand this, so a class is simply creating a template for an object. class Bow { int arrows; }; and an object is simply creating a specific item using the class template. Bow::Bow(arrows) { arrows = 20; } also another question, public specifiers are used to make data members avaible in objects and private specifiers are used to make data memebers only avaialble inside the class?

Read the article
image processing problem

- by riyana

i'm working on detecting shape of any object.i've a binary image where background is white pixels and foreground/object is black pixel. now i need to detect the shape of the area where there are black pixels.how can i do it?the shape may be of a man/car/box etc. plz help

Read the article

< Previous Page | 150 151 152 153 154 155 156 157 158 159 160 161 | Next Page >