Search Results

Search found 351 results on 15 pages for 'theoretical'.

Page 11/15 | < Previous Page | 7 8 9 10 11 12 13 14 15 | Next Page >

Being a more attractive job candidate - Certs XOR Degree

- by Zephyr Pellerin

I'm currently working in an IT position, where I do helpdesk stuff, and predominantly security related issues/consulting (In the loosest sense of the term) In-House and for Service-Contract clients (as the only/acting CCSP [I guess I should say only person with Cisco experience] in my organization). I've professionally written Kernel Mode drivers for a gaming company. Among other things that I'm proud to put on a resume. I think of myself as very reasonably qualified as a System Administrator, With excellent Cisco experience, among other things I think would make a good addition to almost any IT staff in need of a new employee. However, Something has always tripped me up - Human Resources. Let me explain, I decided to skip the university route - I'm immensely glad that I did, The computer science graduates that I've met and work with rarely know much of anything about Computers (Until they gain some 'real' experience), Even when asked about Theoretical Computing fundamentals they can rattle something off about Turing Completeness but rarely do they understand the mathematical underpinnings. In short, I think instead of going to college, I'd rather pick up some real world experience. However, Apparently, Employers rarely think the same way. A quick perusal of jobs through the standard job search engine yields nothing short of a conspiracy to exclude anyone without 'A Bachelors Degree in Computer Science or Equivalent'. Interviews I've had in the past have almost always been entangled with - 1. My Age (Which I can't really change) and 2. Lack of Degree. Employers frequently disregard the CCNA/CCSP, The experience I've gained through internships, My extensive experience in x86 assembly and C, among so many other things I like to think are valuable to employers - In lieu of the fact that I don't have a piece of paper. So, AS AN EMPLOYER - Is it even worth working on my CCIE? Or should I pad my resume with certifications that are easier to acquire (Like CISSP, MSCE, Network+, etc.). Or should I ditch the whole idea and head back to get a Mathematics or CS degree?

Read the article
Testing for disk write

- by Montecristo

I'm writing an application for storing lots of images (size <5MB) on an ext3 filesystem, this is what I have for now. After some searching here on serverfault I have decided for a structure of directories like this: 000/000/000000001.jpg ... 236/519/236519107.jpg This structure will allow me to save up to 1'000'000'000 images as I'll store a max of 1'000 images in each leaf. I've created it, from a theoretical point of view seems ok to me (though I've no experience on this), but I want to find out what will happen when there will be directories full of files in there. A question about creating this structure: is it better to create it all in one go (takes approx 50 minutes on my pc) or should I create directories as they are needed? From a developer point of view I think the first option is better (no extra waiting time for the user), but from a sysadmin point of view, is this ok? I've thought I could do as if the filesystem is already under the running application, I'll make a script that will save images as fast as it can, monitoring things as follows: how much time does it take for an image to be saved when there is no or little space used? how does this change when the space starts to be used up? how much time does it take for an image to be read from a random leaf? Does this change a lot when there are lots of files? Does launching this command sync; echo 3 | sudo tee /proc/sys/vm/drop_caches has any sense at all? Is this the only thing I have to do to have a clean start if I want to start over again with my tests? Do you have any suggestions or corrections?

Read the article
Which is more secure: Tomcat standalone or Tomcat behind Apache?

- by NoozNooz42

This question is not about performance, nor about load-balancing, etc. Which would be more secure: running Tomcat in standalone mode or running Tomcat behind apache? The thing is, Tomcat is written in Java and hence it is pretty much immune to buffer overrun/overflow (unless a buffer overrun in a C-written lib used by Tomcat can be triggered, but they're rare [the last I remember was in zlib, many many moons ago] and one heck of a hack to actually exploit), which gets rid of a lot of potential exploits. This page: http://wiki.apache.org/tomcat/FAQ/Security has this to say: There have been no public cases of damage done to a company, organization, or individual due to a Tomcat security issue... there have been only theoretical vulnerabilities found. All of those were addressed even though there were no documented cases of actual exploitation of these vulnerabilities. This, combined with the fact that buffer overrun/overflow are pretty much non-existent in Java, makes me believe that Tomcat in standalone mode is pretty secure. In addition to that, I can install both Java and Tomcat on Linux without needing to be root. The only moment I need to be root is to set up a transparent port 8080 to port 80 forwarding (and 8443 to 443). Two iptables line as root, that's all root is needed for. (I don't know for Apache). Apache is much more used than Tomcat and definitely does not have a security track record as good as Tomcat. What would make Tomcat + Apache more secure? What would make Tomcat + Apache less secure? In short: which is more secure, Tomcat standalone or Tomcat with Apache? (remembering that performance aren't an issue here)

Read the article
Server 2012, Jumbo Frames - should I expect problems?

- by TomTom

Ok, this sound might stupid - but is there any negative on just enabling jumbo frames in practice? From what I understand: Any switch or ethernet adapter that sees a jumbo frame it can not handle will just drop it. TCP is not a problem as max frame size is negotiated in the setinuo phase. UCP is a theoretical problem as a server may just send a LARGE UDP packet that gets dropped on the way. Practically though, as UDP is packet based, I do not really think any software WOULD send a UDP packet larger than 1500 bytes net without app level configuration changes - at least this is how I do my programming, as it is quite hard to get a decent MTU size for that without testing yourself, so you fall back in programming to max 1500 packets. The network in question is a standard small business network - we upgraded now from a non managed 24 port switch to a 52 port switch with 4 10g ports (netgear - quite cheap) and will mov a file server to 10g for also ISCSI serving. All my equipment on the Ethernet level can handle minimum 9000 bytes and due to local firewalls I really want to get packets larger (less firewall processing), but the network is also NAT'ed to the internet. On top, different machines move around (download) large files (multi gigabyte area) quite often for processing. The question is - can I expect problems when I just enable jumbo frames? Again, this is not totally ignorance - I just don't see programs sending more than 1500 byte UDP packets (if that is a practical problem please tell me) and for TCP the MTU is negotiated anyway. if there is a problem I can move to a dedicated VLAN, but this has it's own shares of problems as basically most workstations must then be on both VLAN's.

Read the article
DNS failover in a two datacenter scenario

- by wanson

I'm trying to implement a low-cost solution for website high availability. I'm looking for the downsides of the following scenario: I have two servers with the same configuration, content, mysql replication (dual-master). They are in different datacenters - let's call them serverA and serverB. Users use serverA - serverB is more like a backup. Now, I want to use DNS failover, to switch users from serverA to serverB when serverA goes down. My idea is that I setup DNS servers (bind/powerdns) on serverA and serverB - let's call them ns1.website.com and ns2.website.com (assuming I own website.com). Then I configure my domain to use them as its nameservers. Both DNS servers will return serverA IP as my website's IP. If serverA goes down I can (either manually or automatically from serverB) change configuration of serverB's DNS, to return IP of serverB as website's IP. Of course the TTL will be low, as it's supposed to be in DNS failovers. I know that it may take some time to switch to serverB (DNS ttl, time to detect serverA failure, serverB DNS reconfiguration etc), and that some small part of users won't use serverB anyway. And I'm OK with that. But what are other downsides of such an approach? An alternative scenario is that ns1.website.com will return serverA IP as website's IP, and ns2.website.com will return serverB IP as website's IP. But AFAIK clients not always use primary nameserver and sometimes would use secondary one. So some small part of users would use serverB instead of serverA which is not quite what I'd like. Can you confirm that DNS clients behave like that and can you tell what percentage of clients would possibly use serverB instead of serverA (statistically)? This one also has the downside that when serverA goes back up, it will be automatically used as website's primary server, which is also a bad situation (cold cache, mysql replication could fail in the meantime etc). So I'm adding it only as a theoretical alternative. I was thinking about using some professional DNS failover companies but they charge for the number of DNS requests and the fees are very high (why?)

Read the article
scalable yet doable small-medium office network

- by Jared

Hello, I'm studying up with both Microsoft and Cisco literature and I must say, my head is starting to get clustered up (pun intended). I've made a quick network diagram of a theoretical company... Company1 owns Company 2 and Company 3, which are all under separate rooms and networks, but must be able to share a few resources such as files or printers. Given the amount of info out there and best practices, I thought about posting here to get suggestions and see what would the pro's do. I can read and read all day and implement on my own, but if I dont get some outside input, how will I know if I'm doing something wrong, right? anyway, please take a look and see if this is an over-complicated network or a lackluster design for a small-medium company of about 35 people and lets say they will be double that number by end of the year... :) Using win2k3, esxi, windows xp. FCS - forefront client security, ACS - access control system, SPCWK - spiceworks, XCH - Exchange Im not allowed to post an image yet, so here's the link ---- GLIFFY IMAGE Flame suit is on just in case people get mad at me for making an "abomination". I'd really want to get the general overview properly before I dive into the more complicated things

Read the article
TCP Server Memory management: #Connections Vs. #Requests

- by Andrew

Given that, there is no theoretical limit to number of concurrent TCP connections a Windows 2008 server can handle. Only thing will happen is, with each connection there will be memory consumption in server. Unfortunately, memory is not unlimited (and I want to utilize only physical memory). For example, lets say we've 2GB server memory. Now there are two extreme cases: Case 1: If we've allocated 64KB buffer for each connection (only to receive incoming request), then 32768 connections can consume all the 2GB of memory. This will not leave any memory to queue/process incoming requests from those connections. Case 2: On the other hand, lets say a single (or very few) connections continuously keeps sending request buffers (for example, video streaming from one connection to other) and server cannot process them within time, those buffers will get piled up in server and eventually will occupy most of the servers memory. And it will not leave any memory for new connection thereafter. This is the real dilemma in server design bugging me badly for last many days. If I can decide on max size of request buffer per connection and max number of requests to allow in queue per connection. Then, based on available server memory, it will then automatically set limit on max number of concurrent connections. How to decide on these limits to achieve best performance and throughput? I am just looking for perfect utilization of server resources. Are there any standard guidelines or empirical data available with someone who can share with me please.

Read the article
What is the max supported number of SATA devices (using cable adapters) on a Dell SAS 6/iR adapter?

- by Zac B

I've got a Dell SAS 6/iR PCI-E adapter. I don't have a multiplier backplane. I'm planning on connecting SATA (non SAS) drives. If I buy cable adapters only (ones that split a SAS connector on the card to a certain number of SATA cables), how many drives can I connect to this card? The way I see it, there are two limitations: a limitation imposed by the theoretical max number of devices supported on the card (which I've dug through the specs to find, but haven't seen yet), and a limitation imposed by the number of SAS plugs on the card multiplied by the number of SATA cables that come out of the highest-multiplying splitter I can buy. The answer to my question would be the minimum of those two limitations. I've seen 4x SATA coming out of some splitters; are there any that have more? Alternatively, if this is an RTFM question, does anyone have a good link to a "this is how SAS works, this is how you figure out the max number of devices, and this is how the concepts of 'ports', 'lanes', 'endpoint devices', and 'connectors' all relate in SAS-land" document? I've looked around on the Dell docs, but haven't found anything that explains this to someone at my level of understanding of SAN/enterprise storage technologies. Cheers!

Read the article
How do I lower the hardware volume? (volume too high)

- by Zom-B

I have a 4yo Dell laptop with Windows XP Pro (modern ones unfortunately don't have a physical volume knob), and lately I'm using my Apple earphones, because they have much better low frequency response than my $10 earphones. They also have the side effect of being much louder. To give an indication of my agony, for most tasks (movie, music, games) I have my main volume at 3 ticks: drag to 0 with the mouse and press the up key 3 times (the handle does not even raise 1 pixel) and my wave volume at 50%. I notice that when I do this, I have a lot of digital noise, because I'm using just a tiny fraction of the 16-bit space. If I drag the Wave slider down until I barely hear the audio, it becomes really distorted and noisy, indicating that this is digital volume (in the DirectSound driver or something) and not hardware volume. I experimented in Audition. When I make a tone of 1000Hz at -50db, (all windows volumes at max) the volume is just below my pain threshold. When I zoom in to see how high the sample values reach, I see that just 8 of the 16 bits are used (about -100 ~ 100). When I generate such tone at -80db (minimum I can specify) then I can still clearly hear the tone, although really noisy. When I zoom in, I see that just 3 out of 16 bits are used. I created a squarewave tone that is just 1 bit high, and I can still hear it! For most uses, this is not a big problem (audiophiles will disagree!), as I just have more noise than usual (about the same as old 8 bit hardware), but I'm also in the process of programming a hearing test program, in which case this problem is a death blow as the test subjects will even hear tones at the bottom of the theoretical range (lowering the windows volume is futile, see above) (I cannot update drivers, as Dell has discontinued XP support for my model)

Read the article
2 Computers, same network, different outgoing speeds when uploading to internet?

- by user117339

I have 2 work machines in my office, a PowerMac G5 and a MacBook Air. Both behind an IPCop firewall. The PowerMac is connected through a gigabit switch, the MacBook Air is connected through a Netgear 802.11g access point that is then plugged into the gigabit switch. There is also a FreeNAS box, both machines are able to read and write files to it at close to their pipe speeds. The main problem is when I am trying to upload files to the internet at large. The G5 is only hitting 0.1 - 0.25 Mbps. The Macbook is able to hit 2-3 Mbps. The setup (G5 / IPCop / Network) has been the same for 5 years. The issues with the internet speed started about 3 months ago. I hadn't tested on the Macbook at this point. I had complained to the ISP, they said their modem needed a firmware update, did that nothing changed. Reset IPCop, turned off squid, etc. No changes. The ISP switched the office over to a better plan with a theoretical 6 Mbps up, still no change. At this point I tried testing the Macbook, and lo and behold there's the speed. But why? I have tried changing out everything, cables, switches, using another ethernet port on the G5, wiping the system, using DHCP, using manual IPs, changing DNS servers, etc. Nothing works. I figured that if there was something horribly wrong with the network, then internally I would find a similar issue, but that is perfect. iperf, ping, etc show no dropped packets and near saturation of the internal network. I'm at a loss as to what the heck is going on. Any ideas would be appreciated! Below are some screenshots of speedtest.net: G5: Macbook Air:

Read the article
Replicated filesystem and EC2 MySQL

- by El Yobo

I'm currently investigating migrating our infrastructure over to run on Amazon's EC2 and am trying to figure out the best way to set up a MySQL service. I'm leaning towards running our own MySQL instances, rather than going with Amazon's RDS, but am still considering the best approach for performance and cost on the instance itself. In order to have persistent data, the MySQL data needs to be on an EBS volume (with some form of striped RAID, e.g. RAID0 or RAID10) to improve persistence. However, EBS IO is limited by the network interface (gigabit, so a theoretical maximum of 128 MB/s), while the ephemeral volumes have no such problem. I did see a suggestion for running two MySQL servers on an instance, with a master running on the ephemeral disk (which we would also RAID) and a slave storing changes to an EBS volume, but this has some additional overhead and complexity (two servers). What I was imagining is using some form of replicated file system such that I could have a filesystem on top of a RAID0 of ephemeral volumes to maximise performance all changes from the above immediately replicated to another RAID1 volume backed by multiple EBS volumes to ensure no data loss The advantages of this would be best possible IO performance for the DB server; no network delay in IO decreased IO on EBS volumes (as all read IO will be done on the ephemeral volumes) so decreased cost good data security, as it's backed onto redundant EBS volumes However, I haven't seen an appropriate system to replicate all changes from one volume to the other; is there a filesystem, or any other approach, which will do this? The distributed file systems, e.g. GlusterFS, DRBD etc seem to focus on replicating disks between servers, can they be set up to do what I'm interested in here? I also haven't seen anything about other's taking this approach. Do I have a solution in need of a problem here (i.e. is performance good enough, so this whole idea is redundant)? Is there some flaw in the plan?

Read the article
Throughput and why do ISPs sell too much bandwidth?

- by jonescb

I hope the question made sense how I worded it. :) I've been wondering, maximum theoretical bandwidth is measured as RWIN/RTT (Window size / round trip time) Source 1 and Souce 2 So if a major city only 100 miles away gives me a ping of 50ms, and I have the default 64kb TCP window size then my maximum throughput will be 12.5Mb/s. Everything further away would give me a higher ping and therefore a lower throughput. Is there any reason to buy something like FiOS with a 50Mb/s or greater connection? Will you ever be able to reach that kind of speed? I know you can increase the TCP window size to increase throughput, but it has to be at both ends which is a deal breaker because you can't control the server. I'm assuming other network protocols like UDP aren't quite as affected by latency as TCP is, but how much of overall network traffic does non-TCP make up vs TCP. Am I just misguided about how throughput works? But if the above is correct, then why should a consumer like me buy way more bandwidth than can be realistically used. Maybe the only reason is for downloading multiple things at once, or one thing from multiple servers/peers?

Read the article
Type Casting variables in PHP: Is there a practical example?

- by Stephen

PHP, as most of us know, has weak typing. For those who don't, PHP.net says: PHP does not require (or support) explicit type definition in variable declaration; a variable's type is determined by the context in which the variable is used. Love it or hate it, PHP re-casts variables on-the-fly. So, the following code is valid: $var = "10"; $value = 10 + $var; var_dump($value); // int(20) PHP also alows you to explicitly cast a variable, like so: $var = "10"; $value = 10 + $var; $value = (string)$value; var_dump($value); // string(2) "20" That's all cool... but, for the life of me, I cannot conceive of a practical reason for doing this. I don't have a problem with strong typing in languages that support it, like Java. That's fine, and I completely understand it. Also, I'm aware of—and fully understand the usefulness of—type hinting in function parameters. The problem I have with type casting is explained by the above quote. If PHP can swap types at-will, it can do so even after you force cast a type; and it can do so on-the-fly when you need a certain type in an operation. That makes the following valid: $var = "10"; $value = (int)$var; $value = $value . ' TaDa!'; var_dump($value); // string(8) "10 TaDa!" So what's the point? Can anyone show me a practical application or example of type casting—one that would fail if type casting were not involved? I ask this here instead of SO because I figure practicality is too subjective. Edit in response to Chris' comment Take this theoretical example of a world where user-defined type casting makes sense in PHP: You force cast variable $foo as int -- (int)$foo. You attempt to store a string value in the variable $foo. PHP throws an exception!! <--- That would make sense. Suddenly the reason for user defined type casting exists! The fact that PHP will switch things around as needed makes the point of user defined type casting vague. For example, the following two code samples are equivalent: // example 1 $foo = 0; $foo = (string)$foo; $foo = '# of Reasons for the programmer to type cast $foo as a string: ' . $foo; // example 2 $foo = 0; $foo = (int)$foo; $foo = '# of Reasons for the programmer to type cast $foo as a string: ' . $foo; UPDATE Guess who found himself using typecasting in a practical environment? Yours Truly. The requirement was to display money values on a website for a restaurant menu. The design of the site required that trailing zeros be trimmed, so that the display looked something like the following: Menu Item 1 .............. $ 4 Menu Item 2 .............. $ 7.5 Menu Item 3 .............. $ 3 The best way I found to do that wast to cast the variable as a float: $price = '7.50'; // a string from the database layer. echo 'Menu Item 2 .............. $ ' . (float)$price; PHP trims the float's trailing zeros, and then recasts the float as a string for concatenation.

Read the article
Threading Overview

- by ACShorten

One of the major features of the batch framework is the ability to support multi-threading. The multi-threading support allows a site to increase throughput on an individual batch job by splitting the total workload across multiple individual threads. This means each thread has fine level control over a segment of the total data volume at any time. The idea behind the threading is based upon the notion that "many hands make light work". Each thread takes a segment of data in parallel and operates on that smaller set. The object identifier allocation algorithm built into the product randomly assigns keys to help ensure an even distribution of the numbers of records across the threads and to minimize resource and lock contention. The best way to visualize the concept of threading is to use a "pie" analogy. Imagine the total workset for a batch job is a "pie". If you split that pie into equal sized segments, each segment would represent an individual thread. The concept of threading has advantages and disadvantages: Smaller elapsed runtimes - Jobs that are multi-threaded finish earlier than jobs that are single threaded. With smaller amounts of work to do, jobs with threading will finish earlier. Note: The elapsed runtime of the threads is rarely proportional to the number of threads executed. Even though contention is minimized, some contention does exist for resources which can adversely affect runtime. Threads can be managed individually – Each thread can be started individually and can also be restarted individually in case of failure. If you need to rerun thread X then that is the only thread that needs to be resubmitted. Threading can be somewhat dynamic – The number of threads that are run on any instance can be varied as the thread number and thread limit are parameters passed to the job at runtime. They can also be configured using the configuration files outlined in this document and the relevant manuals.Note: Threading is not dynamic after the job has been submitted Failure risk due to data issues with threading is reduced – As mentioned earlier individual threads can be restarted in case of failure. This limits the risk to the total job if there is a data issue with a particular thread or a group of threads. Number of threads is not infinite – As with any resource there is a theoretical limit. While the thread limit can be up to 1000 threads, the number of threads you can physically execute will be limited by the CPU and IO resources available to the job at execution time. Theoretically with the objects identifiers evenly spread across the threads the elapsed runtime for the threads should all be the same. In other words, when executing in multiple threads theoretically all the threads should finish at the same time. Whilst this is possible, it is also possible that individual threads may take longer than other threads for the following reasons: Workloads within the threads are not always the same - Whilst each thread is operating on the roughly the same amounts of objects, the amount of processing for each object is not always the same. For example, an account may have a more complex rate which requires more processing or a meter has a complex amount of configuration to process. If a thread has a higher proportion of objects with complex processing it will take longer than a thread with simple processing. The amount of processing is dependent on the configuration of the individual data for the job. Data may be skewed – Even though the object identifier generation algorithm attempts to spread the object identifiers across threads there are some jobs that use additional factors to select records for processing. If any of those factors exhibit any data skew then certain threads may finish later. For example, if more accounts are allocated to a particular part of a schedule then threads in that schedule may finish later than other threads executed. Threading is important to the success of individual jobs. For more guidelines and techniques for optimizing threading refer to Multi-Threading Guidelines in the Batch Best Practices for Oracle Utilities Application Framework based products (Doc Id: 836362.1) whitepaper available from My Oracle Support

Read the article
How the SPARC T4 Processor Optimizes Throughput Capacity: A Case Study

- by Ruud

This white paper demonstrates the architected latency hiding features of Oracle’s UltraSPARC T2+ and SPARC T4 processors That is the first sentence from this technical white paper, but what does it exactly mean? Let's consider a very simple example, the computation of a = b + c. This boils down to the following (pseudo-assembler) instructions that need to be executed: load @b, r1 load @c, r2 add r1,r2,r3 store r3, @a The first two instructions load variables b and c from an address in memory (here symbolized by @b and @c respectively). These values go into registers r1 and r2. The third instruction adds the values in r1 and r2. The result goes into register r3. The fourth instruction stores the contents of r3 into the memory address symbolized by @a. If we're lucky, both b and c are in a nearby cache and the load instructions only take a few processor cycles to execute. That is the good case, but what if b or c, or both, have to come from very far away? Perhaps both of them are in the main memory and then it easily takes hundreds of cycles for the values to arrive in the registers. Meanwhile the processor is doing nothing and simply waits for the data to arrive. Actually, it does something. It burns cycles while waiting. That is a waste of time and energy. Why not use these cycles to execute instructions from another application or thread in case of a parallel program? That is exactly what latency hiding on the SPARC T-Series processors does. It is a hardware feature totally transparent to the user and application. As soon as there is a delay in the execution, the hardware uses these otherwise idle cycles to execute instructions from another process. As a result, the throughput capacity of the system improves because idle cycles are no longer wasted and therefore more jobs can be run per unit of time. This feature has been in the SPARC T-series from the beginning, so why this paper? The difference with previous publications on this topic is in the amount of detail given. How this all works under the hood is fully explained using two example programs. Starting from the assembly language instructions, it is demonstrated in what way these programs execute. To really see what is happening we go down to the processor pipeline level, where the gaps in the execution are, and show in what way these idle cycles are filled by other copies of the same program running simultaneously. Both the SPARC T4 as well as the older UltraSPARC T2+ processor are covered. You may wonder why the UltraSPARC T2+ is included. The focus of this work is on the SPARC T4 processor, but to explain the basic concept of latency hiding at this very low level, we start with the UltraSPARC T2+ processor because it is architecturally a much simpler design. From the single issue, in-order pipelines of this processor we then shift gears and cover how this all works on the much more advanced dual issue, out-of-order architecture of the T4. The analysis and performance experiments have been conducted on both processors. The results depend on the processor, but in all cases the theoretical estimates are confirmed by the experiments. If you're interested to read a lot more about this and find out how things really work under the hood, you can download a copy of the paper here. A paper like this could not have been produced without the help of several other people. I want to thank the co-author of this paper, Jared Smolens, for his very valuable contributions and our highly inspiring discussions. I'm also indebted to Thomas Nau (Ulm University, Germany), Shane Sigler and Mark Woodyard (both at Oracle) for their feedback on earlier versions of this paper. Karen Perkins (Perkins Technical Writing and Editing) and Rick Ramsey at Oracle were very helpful in providing editorial and publishing assistance.

Read the article
I am trying to figure out the best way to understand how to cache domain objects

- by Brett Ryan

I've always done this wrong, I'm sure a lot of others have too, hold a reference via a map and write through to DB etc.. I need to do this right, and I just don't know how to go about it. I know how I want my objects to be cached but not sure on how to achieve it. What complicates things is that I need to do this for a legacy system where the DB can change without notice to my application. So in the context of a web application, let's say I have a WidgetService which has several methods: Widget getWidget(); Collection<Widget> getAllWidgets(); Collection<Widget> getWidgetsByCategory(String categoryCode); Collection<Widget> getWidgetsByContainer(Integer parentContainer); Collection<Widget> getWidgetsByStatus(String status); Given this, I could decide to cache by method signature, i.e. getWidgetsByCategory("AA") would have a single cache entry, or I could cache widgets individually, which would be difficult I believe; OR, a call to any method would then first cache ALL widgets with a call to getAllWidgets() but getAllWidgets() would produce caches that match all the keys for the other method invocations. For example, take the following untested theoretical code. Collection<Widget> getAllWidgets() { Entity entity = cache.get("ALL_WIDGETS"); Collection<Widget> res; if (entity == null) { res = loadCache(); } else { res = (Collection<Widget>) entity.getValue(); } return res } Collection<Widget> loadCache() { // Get widgets from underlying DB Collection<Widget> res = db.getAllWidgets(); cache.put("ALL_WIDGETS", res); Map<String, List<Widget>> byCat = new HashMap<>(); for (Widget w : res) { // cache by different types of method calls, i.e. by category if (!byCat.containsKey(widget.getCategory()) { byCat.put(widget.getCategory(), new ArrayList<Widget>); } byCat.get(widget.getCatgory(), widget); } cacheCategories(byCat); return res; } Collection<Widget> getWidgetsByCategory(String categoryCode) { CategoryCacheKey key = new CategoryCacheKey(categoryCode); Entity ent = cache.get(key); if (entity == null) { loadCache(); } ent = cache.get(key); return ent == null ? Collections.emptyList() : (Collection<Widget>)ent.getValue(); } NOTE: I have not worked with a cache manager, the above code illustrates cache as some object that may hold caches by key/value pairs, though it's not modelled on any specific implementation. Using this I have the benefit of being able to cache all objects in the different ways they will be called with only single objects on the heap, whereas if I were to cache the method call invocation via say Spring It would (I believe) cache multiple copies of the objects. I really wish to try and understand the best ways to cache domain objects before I go down the wrong path and make it harder for myself later. I have read the documentation on the Ehcache website and found various articles of interest, but nothing to give a good solid technique. Since I'm working with an ERP system, some DB calls are very complicated, not that the DB is slow, but the business representation of the domain objects makes it very clumsy, coupled with the fact that there are actually 11 different DB's where information can be contained that this application is consolidating in a single view, this makes caching quite important.

Read the article
Why is the framerate (fps) capped at 60?

- by dennmat

ISSUE I recently moved a project from my laptop to my desktop(machine info below). On my laptop the exact same code displays the fps(and ms/f) correctly. On my desktop it does not. What I mean by this is on the laptop it will display 300 fps(for example) where on my desktop it will show only up to 60. If I add 100 objects to the game on the laptop I'll see my frame rate drop accordingly; the same test on the desktop results in no change and the frames stay at 60. It takes a lot(~300) entities before I'll see a frame drop on the desktop, then it will descend. It seems as though its "theoretical" frames would be 400 or 500 but will never actually get to that and only do 60 until there's too much to handle at 60. This 60 frame cap is coming from no where. I'm not doing any frame limiting myself. It seems like something external is limiting my loop iterations on the desktop, but for the last couple days I've been scratching my head trying to figure out how to debug this. SETUPS Desktop: Visual Studio Express 2012 Windows 7 Ultimate 64-bit Laptop: Visual Studio Express 2010 Windows 7 Ultimate 64-bit The libraries(allegro, box2d) are the same versions on both setups. CODE Main Loop: while(!abort) { frameTime = al_get_time(); if (frameTime - lastTime >= 1.0) { lastFps = fps/(frameTime - lastTime); lastTime = frameTime; avgMspf = cumMspf/fps; cumMspf = 0.0; fps = 0; } /** DRAWING/UPDATE CODE **/ fps++; cumMspf += al_get_time() - frameTime; } Note: There is no blocking code in the loop at any point. Where I'm at My understanding of al_get_time() is that it can return different resolutions depending on the system. However the resolution is never worse than seconds, and the double is represented as [seconds].[finer-resolution] and seeing as I'm only checking for a whole second al_get_time() shouldn't be responsible. My project settings and compiler options are the same. And I promise its the same code on both machines. My googling really didn't help me much, and although technically it's not that big of a deal. I'd really like to figure this out or perhaps have it explained, whichever comes first. Even just an idea of how to go about figuring out possible causes, because I'm out of ideas. Any help at all is greatly appreciated. EDIT: Thanks All. For any others that find this to disable vSync(windows only) in opengl: First get "wglext.h". It's all over the web. Then you can use a tool like GLee or just write your own quick extensions manager like: bool WGLExtensionSupported(const char *extension_name) { PFNWGLGETEXTENSIONSSTRINGEXTPROC _wglGetExtensionsStringEXT = NULL; _wglGetExtensionsStringEXT = (PFNWGLGETEXTENSIONSSTRINGEXTPROC) wglGetProcAddress("wglGetExtensionsStringEXT"); if (strstr(_wglGetExtensionsStringEXT(), extension_name) == NULL) { return false; } return true; } and then create and setup your function pointers: PFNWGLSWAPINTERVALEXTPROC wglSwapIntervalEXT = NULL; PFNWGLGETSWAPINTERVALEXTPROC wglGetSwapIntervalEXT = NULL; if (WGLExtensionSupported("WGL_EXT_swap_control")) { // Extension is supported, init pointers. wglSwapIntervalEXT = (PFNWGLSWAPINTERVALEXTPROC) wglGetProcAddress("wglSwapIntervalEXT"); // this is another function from WGL_EXT_swap_control extension wglGetSwapIntervalEXT = (PFNWGLGETSWAPINTERVALEXTPROC) wglGetProcAddress("wglGetSwapIntervalEXT"); } Then just call wglSwapIntervalEXT(0) to disable vSync and 1 to enable vSync. I found the reason this is windows only is that openGl actually doesn't deal with anything other than rendering it leaves the rest up to the OS and Hardware. Thanks everyone saved me a lot of time!

Read the article
Is there a perfect algorithm for chess?

- by Overflown

Dear Stack Overflow community, I was recently in a discussion with a non-coder person on the possibilities of chess computers. I'm not well versed in theory, but think I know enough. I argued that there could not exist a deterministic Turing machine that always won or stalemated at chess. I think that, even if you search the entire space of all combinations of player1/2 moves, the single move that the computer decides upon at each step is based on a heuristic. Being based on a heuristic, it does not necessarily beat ALL of the moves that the opponent could do. My friend thought, to the contrary, that a computer would always win or tie if it never made a "mistake" move (however do you define that?). However, being a programmer who has taken CS, I know that even your good choices - given a wise opponent - can force you to make "mistake" moves in the end. Even if you know everything, your next move is greedy in matching a heuristic. Most chess computers try to match a possible end game to the game in progress, which is essentially a dynamic programming traceback. Again, the endgame in question is avoidable though. -- thanks, Allan Edit: Hmm... looks like I ruffled some feathers here. That's good. Thinking about it again, it seems like there is no theoretical problem with solving a finite game like chess. I would argue that chess is a bit more complicated than checkers in that a win is not necessarily by numerical exhaustion of pieces, but by a mate. My original assertion is probably wrong, but then again I think I've pointed out something that is not yet satisfactorily proven (formally). I guess my thought experiment was that whenever a branch in the tree is taken, then the algorithm (or memorized paths) must find a path to a mate (without getting mated) for any possible branch on the opponent moves. After the discussion, I will buy that given more memory than we can possibly dream of, all these paths could be found.

Read the article
[OffTopic] PhD is it worth it?

- by Zenzen

So I'll be graduating next year (hopefully) and lately I started thinking about getting a PhD in computer science (to be more precise I was thinking about something related to "distributed systems", don't have any specific topics yet in mind), but is it really worth it? A PhD course here lasts 4 years and is free IF you're doing the "normal one" (which means you have class from monday to friday) OR you have to pay if you want to study on the weekends. So I've been thinking about getting a normal full time job (as a JEE programmer) and doing my PhD during the weekends OR doing the normal PhD course (mon-fri) and getting a part time job as a software dev (the main reason I want a job is simple - I need to eat). That would give me practical working experience and theoretical knowledge+a PhD, but it would also mean working 7days a week from 9 to 5 for 4 years straight (sounds like an overkill). Is it, job wise, really worth getting a PhD? As an employer do you prefer people with PhDs or MScs? This might be kinda important - I'm not from the US or western europe, how are PhDs from other countries (I am graduating one of the best, if not the best, school in my country but it's still a central european country...) seen? Are they useless? I won't hide it - after graduation I am thinking about moving abroad.

Read the article
Resources related to data-mining and gaming on social networks

- by darren

Hi all I'm interested in the problem of patterning mining among players of social networking games. For example detecting cheaters of a game, given a company's user database. So far I have been following the usual recipe for a data mining project: construct a data warehouse that aggregates significant information select a classifier, and train it with a subsectio of records from the warehouse validate classifier with another test set lather, rinse, repeat Surprisingly, I've found very little in this area regarding literature, best practices, etc. I am hoping to crowdsource the information gathering problem here. Specifically what I'm looking for: What classifiers have worked will for this type of pattern mining (it seems highly temporal, users playing games, users receiving rewards, users transferring prizes etc). Are there any highly agreed upon attributes specific to social networking / gaming data? What is a practical amount of information that should be considered? One problem I've run into is data overload, where queries and data cleansing may take days to complete. Related to point above, what hardware resources are required to produce results? I've found it difficult to estimate the amount of computing power I will require for production use. It has become apparent that a white box in the corner does not have enough horse-power for such a project. Are companies generally resorting to cloud solutions? Are they buying clusters? Basically, any resources (theoretical, academic, or practical) about implementing a social networking / gaming pattern-mining program would be very much appreciated. Thanks.

Read the article
what type of bug causes a program to slowly use more processor power and all of a sudden go to 100%?

- by reinier

Hi, I was hoping to get some good ideas as to what might be causing a really nasty bug. This is a program which is transmitting data over a socket, and also receives messages back. I could explain lots more, but I don't think this will help here. I'm just searching for hypothetical problems which can cause the following behaviour: program runs processor time slowly accumulates (till around 60%) all of a sudden (could be after 30 but also after 60 seconds) the processor time shoots to 100%. the program halts completely In my syslog it always ends on one line with a memory allocation (something similar to: myArray = new byte[16384]) in the same thread. now here is the weird part: if I set the debugger anywhere...it immediately stops on that line. So just the act of setting a breakpoint, made the thread continue (it wasn't running since I saw no log output anymore) I was thinking 'deadlock' but that would not cause 100% processor power. If anything, the opposite. Also, setting a breakpoint would not cause a deadlock to end. anyone else a theoretical suggestion as to what kind of 'construct' might cause this effect? (apart from 'bad programming') ;^) thanks EDIT: I just noticed.... by setting the sendspeed slower, the problem shows itself much later than expected. I would think around the same amount of packets send...but no the amount of packets send is much higher this way before it has the same problem.

Read the article
call/cc in Lua - Possible?

- by Pessimist

The Wikipedia article on Continuation says: "In any language which supports closures, it is possible to write programs in continuation passing style and manually implement call/cc." Either that is true and I need to know how to do it or it is not true and that statement needs to be corrected. If this is true, please show me how to implement call/cc in Lua because I can't see how. I think I'd be able to implement call/cc manually if Lua had the coroutine.clone function as explained here. If closures are not enough to implement call/cc then what else is needed? The text below is optional reading. P.S.: Lua has one-shot continuations with its coroutine table. A coroutine.clone function would allow me to clone it to call it multiple times, thus effectively making call/cc possible (unless I misunderstand call/cc). However that cloning function doesn't exist in Lua. Someone on the Lua IRC channel suggested that I use the Pluto library (it implements serialization) to marshal a coroutine, copy it and then unmarshal it and use it again. While that would probably work, I am more interested in the theoretical implementation of call/cc and in finding what is the actual minimum set of features that a language needs to have in order to allow for its manual implementation.

Read the article
Practical size limitations for RDBMS

- by grenade

I am working on a project that must store very large datasets and associated reference data. I have never come across a project that required tables quite this large. I have proved that at least one development environment cannot cope at the database tier with the processing required by the complex queries against views that the application layer generates (views with multiple inner and outer joins, grouping, summing and averaging against tables with 90 million rows). The RDBMS that I have tested against is DB2 on AIX. The dev environment that failed was loaded with 1/20th of the volume that will be processed in production. I am assured that the production hardware is superior to the dev and staging hardware but I just don't believe that it will cope with the sheer volume of data and complexity of queries. Before the dev environment failed, it was taking in excess of 5 minutes to return a small dataset (several hundred rows) that was produced by a complex query (many joins, lots of grouping, summing and averaging) against the large tables. My gut feeling is that the db architecture must change so that the aggregations currently provided by the views are performed as part of an off-peak batch process. Now for my question. I am assured by people who claim to have experience of this sort of thing (which I do not) that my fears are unfounded. Are they? Can a modern RDBMS (SQL Server 2008, Oracle, DB2) cope with the volume and complexity I have described (given an appropriate amount of hardware) or are we in the realm of technologies like Google's BigTable? I'm hoping for answers from folks who have actually had to work with this sort of volume at a non-theoretical level.

Read the article
Figuring out the performance limitation of an ADC on a PIC microcontroller

- by AKE

I'm spec-ing the suitability of a microcontroller like PIC for an analog-to-digital application. This would be preferable to using external A/D chips. To do that, I've had to run through some computations, pulling the relevant parameters from the datasheets. I'm not sure I've got it right -- would appreciate a check! Here's the simplest example: PIC10F220 is the simplest possible PIC with an ADC. Runs at clock speed of 8MHz. Has an instruction cycle of 0.5us (4 clock steps per instruction) So: Taking Tacq = 6.06 us (acquisition time for ADC, assume chip temp. = 50*C) [datasheet p34] Taking Fosc = 8MHz (? clock speed) Taking divisor = 4 (4 clock steps per CPU instruction) This gives TAD = 0.5us (TAD = 1/(Fosc/divisor) ) Conversion time is 13*TAD [datasheet p31] This gives conversion time 6.5us ADC duration is then 12.56 us [? Tacq + 13*TAD] Assuming at least 2 instructions for load/store: This is another 1 us [0.5 us per instruction] Which would give max sampling rate of 73.7 ksps (1/13.56) Supposing 8 more instructions for real-time processing: This is another 4 us Thus, total ADC/handling time = 17.56us (12.56us + 1us + 4us) So expected upper sampling rate is 56.9 ksps. Nyquist frequency for this sampling rate is therefore 28 kHz. If this is right, it suggests the (theoretical) performance suitability of this chip's A/D is for signals that are bandlimited to 28 kHz. Is this a correct interpretation of the information given in the data sheet? Any pointers would be much appreciated! AKE

Read the article
Validation in n-tier asp.net mvc applications

- by sTodorov

Dear Stack Overflow gurus, I am looking for some practical/theoretical information regarding best practices for validation in asp.net mvc n-tier applications. I am working on a .Net application divided into the following layers: UI - Mvc3 BLL layer - all business rules. Decoupled from data access and UI layers through interfaces DAL layer - Data access with the repository pattern, EF4 and pocos Now, I am looking for a nice, clean and transparent way to specify my validation rules. Here are some thoughts on the matter so far: UI validation should only be responsible for user input and its validity. BLL validation should be handling the validity of the data regarding the application business rules. My main concern is how to bind the BLL and UI validation in the most efficient way. One think I am would like to avoid is having the UI check in a collection of validation and adding manually errors to the ModelState. Furthermore, I do not want to pass the ModelState to the BLL to be populated in there. I will appreciate any thoughts on the matter. P.S. Should this question be marked as a discussion ?

Read the article

Search Results

Search found 351 results on 15 pages for 'theoretical'.

Page 11/15 | < Previous Page | 7 8 9 10 11 12 13 14 15 | Next Page >

- by Zephyr Pellerin

- by Montecristo

- by NoozNooz42

- by TomTom

- by wanson

- by Jared

- by Andrew

- by Zac B

- by Zom-B

- by user117339

- by El Yobo

- by jonescb

- by Stephen

- by ACShorten

- by Ruud

- by Brett Ryan

- by dennmat

- by Overflown

- by Zenzen

- by darren

- by reinier

- by Pessimist

- by grenade

- by AKE

- by sTodorov

< Previous Page | 7 8 9 10 11 12 13 14 15 | Next Page >