Search Results

Search found 260 results on 11 pages for 'albeit'.

  • swapping or thrashing with vast amounts of unmapped pagecache

    - by Marco
    EDIT: I noticed that this is more appropriate for superuser.com; I apologize. I don't know how to delete this question. I'm using Kubuntu Jaunty (i386, 32-bit), kernel 2.6.28-13-generic. I have 4 GB of RAM, of which only 3317 MB are seen by the system (I guess because of the 32-bit system). I'm seeing that the pagecache utilization is continually growing, up to the point that the system is unusable (after a few days). This also happens when I don't do anything (all user applications closed and the bare minimum of services enabled). If swap is enabled, the system starts to use swap space (using it all in the end). Even if swap is disabled, disk activity becomes continuous, with the system unresponsive. For example, right now the system is working (albeit a tad slow), with only Firefox and Wing IDE running, and I have 2 GB cached with only 45 MB mapped:

        $ free
                     total       used       free     shared    buffers     cached
        Mem:       3346388    3247328      99060          0       8416    2117980
        -/+ buffers/cache:    1120932    2225456
        Swap:      2144668     519448    1625220

        $ cat /proc/meminfo
        MemTotal:        3346388 kB
        MemFree:           97128 kB
        Buffers:            7872 kB
        Cached:          2120224 kB
        SwapCached:       413860 kB
        Active:          2304596 kB
        Inactive:         865984 kB
        Active(anon):    2279168 kB
        Inactive(anon):   830236 kB
        Active(file):      25428 kB
        Inactive(file):    35748 kB
        Unevictable:          32 kB
        Mlocked:              32 kB
        HighTotal:       2492940 kB
        HighFree:           5456 kB
        LowTotal:         853448 kB
        LowFree:           91672 kB
        SwapTotal:       2144668 kB
        SwapFree:        1625244 kB
        Dirty:                84 kB
        Writeback:             0 kB
        AnonPages:        629304 kB
        Mapped:            45768 kB
        Slab:              45600 kB
        SReclaimable:      21756 kB
        SUnreclaim:        23844 kB
        PageTables:         4468 kB
        NFS_Unstable:          0 kB
        Bounce:                0 kB
        WritebackTmp:          0 kB
        CommitLimit:     3817860 kB
        Committed_AS:    3735020 kB
        VmallocTotal:     122880 kB
        VmallocUsed:        9352 kB
        VmallocChunk:      66600 kB
        HugePages_Total:       0
        HugePages_Free:        0
        HugePages_Rsvd:        0
        HugePages_Surp:        0
        Hugepagesize:       4096 kB
        DirectMap4k:       16376 kB
        DirectMap4M:      888832 kB

    If I try to drop the caches, little happens:

        # sync ; echo 3 > /proc/sys/vm/drop_caches ; free
                     total       used       free     shared    buffers     cached
        Mem:       3346388    3220580     125808          0       3020    2100600
        -/+ buffers/cache:    1116960    2229428
        Swap:      2144668     519356    1625312

    Right now I have vm.swappiness = 5, but I've also tried 0 and 1 (without noticeable differences). I've also tried vm.vfs_cache_pressure = 50 and 150 (again, no differences). As I said, the pagecache eats all memory even with swapping turned off. What is happening? How can I avoid this? TIA, Marco

  • Multi-Role Domain Controllers for Small Offices (< 50 clients)

    - by kce
    Warning: I'm a Linux/*NIX admin, so this is all new to me. I understand that it's not considered a good idea to have only a single domain controller, and that it is also probably a good idea for a domain controller to only do AD/DHCP/DNS (Here). We have two offices: location A with 30 users and location B with 10 users. Our two offices are separated by a WAN that is not particularly robust, so I have been instructed that we need to have standalone services in each office. This means that according to "best practices" we will need to build a domain controller and a separate file server in each office. Again, I am not knowledgeable in the ways of Windows, but this seems a little unnecessary for an organization of 40 users. People have commented that I could "get away with" running file services on the domain controller as long as the "load is light". That just seems to generate more questions than it answers. What constitutes light load? What are the potential consequences of mixing these roles? Ideally I would prefer to have only one physical machine at each location. The one in location A (the location with IT staff) can act as the primary domain controller and the one in the smaller office can act as the backup domain controller. If either domain controller fails we can still use the other one for authentication (albeit with some latency), and if the WAN connection fails each office still has access to its respective "local" domain controller. If the file services are ALSO run on each server (and synchronized with something like DFS), a similar arrangement in terms of redundancy can be had without having to purchase, build and install two additional separate servers. It's not that I'm averse to that (well, any more averse than I am to the whole thing to begin with), but to my simple mind it just seems, well, a bit overkill. I can definitely see the benefits of functional separation when we're talking about larger organizations, but I need to consider the additional overhead too. None of this excludes having a DRP setup for the domain controller(s). I assume you can lose two domain controllers just as easily as one.

  • Stack-based keyboard delay using Logitech MX3100 keyboard

    - by Mark S. Rasmussen
    I've been using a Logitech Cordless Desktop MX3100 keyboard for quite a while. I've never really had any problems, except for the occasional typo. I noticed however that I tended to make the typo "Laod" instead of "Load" quite a bit more often than any other typo. As it started to get on my nerves, I decided to do some testing. What I found out was that when I write lowercase "load", I never make the typo. All uppercase, or just an uppercase L, and I make the typo quite often. My actual (very scientific) testing is probably best described by showing the output:

        moatmoatmoat MoatMoatMoat
        loatloatloat LaotLaotLaot
        loafloafloaf LaofLaofLaof
        hoathoathoat HoatHoatHoat
        hoadhoadhoad HoadHoadHoad
        lortlortlort LrotLrotLrot

    What I found out was that whenever Shift was depressed, typing an uppercase "L" would induce a significant lag if the next character was an "o", compared to the lag of any other key:

        High "o" lag:                   LoLoLoLoLoLo
        No "a" lag:                     LaLaLaLaLaLa
        No lag for either "o" or "a":   lolololololo lalalalalala

    By realizing this I regained a slight bit of sanity, as I knew I wasn't coming down with a case of Parkinson's. I was actually typing correctly; the lag just caused my input to be interpreted wrongly. Now, what really bugs me is that I can't fathom how this is occurring. What I'm actually typing, in physical order, is this: L - o - a - d, and yet the "a" is output before the "o", even though "o" was pressed before "a". So while the keyboard is processing the "Lo" combo, the "a" gets prioritized and is inserted before the "o" is done processing, resulting in "Laod" instead of "Load". And this only happens when typing "Lo", not when typing lowercase "lo". This problem could stem from the keyboard hardware, the receiver hardware or the keyboard software driver. No matter the fault location, however, I can't imagine how this could be implemented as anything but a FIFO queue. A general delay, sure, I could live with that, albeit irritated. But a lag affecting different keys differently, and even resulting in unpredictable output order - that just doesn't make any sense. I've solved the problem by just switching to a wired keyboard. I just can't shake it off me though; what kind of bug/error/scenario would result in a case like this? Edit: It's been suggested that I stop drinking Red Bull and stick to water instead. While that may actually help solve the issue, I'm really not looking for a solution as such. I'm more interested in an explanation of how this could happen, as I can't imagine any viable technical implementation that could result in this behavior.
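
    Purely as a hypothetical illustration of the scenario described above - a sketch, emphatically not Logitech's actual driver logic - here is how a "combo check" side buffer that holds some keys back while letting others bypass it would reorder "Load" into "Laod". All names here are made up for the demonstration.

        // Hypothetical: keys involved in a pending combo check ('o' after 'L')
        // wait in a side buffer; other keys bypass it and the buffer only
        // flushes afterwards, so output order no longer matches input order.
        #include <iostream>
        #include <queue>
        #include <string>
        #include <vector>

        int main() {
            std::vector<char> typed = {'L', 'o', 'a', 'd'};
            std::queue<char> delayed;          // keys held for combo processing
            std::string output;

            for (char key : typed) {
                bool held = (key == 'o' && !output.empty() && output.back() == 'L');
                if (held) {
                    delayed.push(key);         // 'o' after 'L' waits here
                } else {
                    output += key;             // everything else goes straight out
                    while (!delayed.empty()) { // buffer flushes only afterwards
                        output += delayed.front();
                        delayed.pop();
                    }
                }
            }
            std::cout << output << '\n';       // prints "Laod"
        }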

  • Backing Up vs. Redundancy

    - by TK Kocheran
    I'm currently in stage 2 of 3 of building my home workstation. What this means is that my RAID-0 array of solid state disks will be backed up nightly to a RAID-5 or RAID-6 array of traditional spinning hard disks. However, it recently dawned on me that redundancy is not backup. The main reason for setting up a RAID array with redundancy was to protect myself in the event of a drive failure, in other words to serve as an effective backup solution. Wait. What if a bolt of lightning finds a way to travel into my house, through my surge protector, into my power supply, and physically destroys all of my hard disks and SSDs? Well, in that case, I guess I'd be fine, because I generally keep the most important files (music, pictures, videos) stored in multiple places, like on my laptop, my wife's laptop, and an encrypted USB hard drive. Wait. What if a giant hedgehog meteor attacks my house from space traveling at Mach 3 and all machines and hard disks are blown to smithereens? Well, I guess I could find a way to do ridiculously slow and cumbersome rsyncs or backups to Amazon's Glacier. Wait. What if there's a nuclear apocalypse... and at this point I start laughing hysterically. At what point does backing up become irrelevant? I completely understand situation one (mechanical drive failure), situation two (workstation compromised or destroyed somehow), possibly even situation three (all machines and disks destroyed), but situation four? There's no questioning the need for backups. None. However, there are three questions I'd really like addressed: To what level should one back up? I definitely understand the merits of physical disk redundancy. I also believe in keeping important files on multiple machines and thinning out the possibility of losing all of my files. Online backups make sense, but they beg the following question. What should I be backing up remotely, and how often? It's no problem storage-wise to back up important files (music, pictures, videos) and even configuration and temporal data for all of the machines in my network (all Linux based)... albeit locally. Transferring to the cloud is another story. Worst-case scenario, if I lost all of the configuration for my individual computers, the reality is that I probably lost the machines too. The cloud is a long way away from here; I can run backups over CAT-6 here and see 100 MB/s easily, but I'm afraid that I'm only going to see 2 MB/s at best when transferring up to the cloud.

  • load-causing processes disappearing from "top" ps -o pcpu shows bogus numbers

    - by Alec Matusis
    I administer a large number of servers, and I have this problem only with Ubuntu 10.04 LTS: I run a server under normal load (say load average 3.0 on an 8-core server). The "top" command shows processes taking a certain % of CPU that cause this load average, say:

        PID   USER  PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+     COMMAND
        11008 mysql 20  0 25.9g  22g 5496 S   67 76.0 643539:38   mysqld

        $ ps -o pcpu,pid -p11008
        %CPU   PID
        53.1 11008

    Everything is consistent. Then all of a sudden, the process causing the load average disappears from "top", but the process continues to run normally (albeit with a slight performance decrease), and the system load average becomes somewhat higher. The output of ps -o pcpu becomes bogus:

        # ps -o pcpu,pid -p11008
        %CPU   PID
        317910278 1587

    This happened to at least 5 different servers (different brand-new IBM System x hardware), each running different software: one httpd 2.2, one mysqld 5.1, and one running Twisted Python TCP servers. Each time the kernel was between 2.6.32-32-server and 2.6.32-40-server. I updated some machines to 2.6.32-41-server, and it has not happened on those yet, but the bug is rare (once every 60 days or so). This is from an affected machine:

        top - 10:39:06 up 73 days, 17:57, 3 users, load average: 6.62, 5.60, 5.34
        Tasks: 207 total, 2 running, 205 sleeping, 0 stopped, 0 zombie
        Cpu(s): 11.4%us, 18.0%sy, 0.0%ni, 66.3%id, 4.3%wa, 0.0%hi, 0.0%si, 0.0%st
        Mem:  74341464k total, 71985004k used, 2356460k free, 236456k buffers
        Swap:  3906552k total, 328k used, 3906224k free, 24838212k cached

        PID  USER PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+    COMMAND
        805  root 20  0     0    0    0 S    3  0.0 1493:09    fct0-worker
        982  root 20  0     0    0    0 S    1  0.0  111:35.05 fioa-data-groom
        914  root 20  0     0    0    0 S    0  0.0  884:42.71 fct1-worker
        1068 root 20  0 19364 1496 1060 R    0  0.0    0:00.02 top

    Nothing causing high load is showing in top, but I have two highly loaded mysqld instances on it that suddenly show a crazy %CPU:

        # ps -o pcpu,pid,cmd -p1587
        %CPU   PID CMD
        317713124 1587 /nail/encap/mysql-5.1.60/libexec/mysqld

    and

        # ps -o pcpu,pid,cmd -p1624
        %CPU   PID CMD
        2802  1624 /nail/encap/mysql-5.1.60/libexec/mysqld

    Here are the numbers from

        # cat /proc/1587/stat
        1587 (mysqld) S 1212 1088 1088 0 -1 4202752 14307313 0 162 0 85773299069 4611685932654088833 0 0 20 0 52 0 3549 27255418880 5483524 18446744073709551615 4194304 11111617 140733749236976 140733749235984 8858659 0 552967 4102 26345 18446744073709551615 0 0 17 5 0 0 0 0 0

    The 14th and 15th numbers, according to man proc, are supposed to be:

        utime %lu  Amount of time that this process has been scheduled in user
                   mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).
                   This includes guest time, guest_time (time spent running a
                   virtual CPU, see below), so that applications that are not
                   aware of the guest time field do not lose that time from
                   their calculations.
        stime %lu  Amount of time that this process has been scheduled in kernel
                   mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).

    On a normal server, these numbers advance every time I check /proc/PID/stat. On a buggy server, these numbers are stuck at a ridiculously high value like 4611685932654088833, and not changing. Has anyone encountered this bug?
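
    For reference, here is a minimal sketch of mine (not from the post) of how a lifetime %CPU figure can be computed straight from fields 14, 15 and 22 of /proc/<pid>/stat, roughly what ps does for pcpu; with a corrupted stime like the one above it produces exactly this kind of absurd percentage. It assumes Linux and minimal error handling.

        // Sketch: %CPU over the process lifetime from /proc/<pid>/stat
        // (utime = field 14, stime = 15, starttime = 22). Linux-only.
        #include <cstdio>
        #include <cstring>
        #include <unistd.h>

        int main(int argc, char **argv) {
            if (argc != 2) { std::fprintf(stderr, "usage: %s pid\n", argv[0]); return 1; }
            char path[64], buf[4096];
            std::snprintf(path, sizeof path, "/proc/%s/stat", argv[1]);
            std::FILE *f = std::fopen(path, "r");
            if (!f || !std::fgets(buf, sizeof buf, f)) { std::perror(path); return 1; }
            std::fclose(f);

            // comm (field 2) may contain spaces, so scan from the last ')'
            const char *p = std::strrchr(buf, ')');
            if (!p) return 1;
            unsigned long long utime, stime, starttime;
            // skip state, 5 ints, 5 unsigned, read utime/stime (fields 14/15),
            // skip 6 more signed fields, read starttime (field 22)
            std::sscanf(p + 1, " %*s %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u"
                               " %llu %llu %*d %*d %*d %*d %*d %*d %llu",
                        &utime, &stime, &starttime);

            double uptime;                      // seconds since boot
            f = std::fopen("/proc/uptime", "r");
            if (!f || std::fscanf(f, "%lf", &uptime) != 1) return 1;
            std::fclose(f);

            long hz = sysconf(_SC_CLK_TCK);     // clock ticks per second
            double cpu_secs = (double)(utime + stime) / hz;
            double elapsed  = uptime - (double)starttime / hz;
            std::printf("%.1f%% CPU over lifetime\n", 100.0 * cpu_secs / elapsed);
            return 0;
        }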

  • VirtualBox Port Forward not working when Guest IP *IS* specified (while doc says opposite)

    - by Patrick
    Trying to port forward from the host's (Mac OS X) 127.0.0.1:8282 to the guest's (CentOS) 10.10.10.10:8080. Existing port forwards include 127.0.0.1:8181 and 9191 to the guest without any IP specified (so whatever it gets through DHCP, as explained in the documentation). Here is how the non-working binding was added:

        VBoxManage modifyvm "VM name" --natpf1 "rule3,tcp,127.0.0.1,8282,10.10.10.10,8080"

    Here is how the working ones were added:

        VBoxManage modifyvm "VM name" --natpf1 "rule1,tcp,127.0.0.1,8181,,80"
        VBoxManage modifyvm "VM name" --natpf1 "rule2,tcp,127.0.0.1,9191,,9090"

    And by "non-working", I of course mean not listening (as a prerequisite to forwarding):

        $ lsof -Pi -n | grep Virtual | grep LISTEN
        VirtualBo 27050 user 21u IPv4 0x2bbdc68fd363175d 0t0 TCP 127.0.0.1:9191 (LISTEN)
        VirtualBo 27050 user 22u IPv4 0x2bbdc68fd0e0af75 0t0 TCP 127.0.0.1:8181 (LISTEN)

    There should be a similar line above but with 127.0.0.1:8282. Just to be clear, this port is listening perfectly fine on the guest itself. And when I remove the guest IP (i.e., clear the 10.10.10.10), the forward works fine, albeit to eth0 (not eth1 where I need it). I can tcpdump and watch the traffic flow back and forth. And yes, I've disabled iptables entirely while testing - it's not getting blocked anywhere on the guest. As VirtualBox writes in their documentation, you are required to specify the guest IP if it's static (makes sense, since the built-in DHCP server keeps no record of it): "If for some reason the guest uses a static assigned IP address not leased from the built-in DHCP server, it is required to specify the guest IP when registering the forwarding rule:". However, doing so (as I need to) seems to break the port forward, with nary a report in any log file I can find. (I've reviewed everything in ~/Library/VirtualBox/.) Other notes: While I used the above command to add the third rule, I've also verified it showed up correctly in the GUI, and then removed/re-added it from there just to make sure. This forum link - while very dated - looks somewhat related, in that a port forward to a static IP was not appearing (perhaps they think due to lack of a gratuitous ARP being sent for the host to know the IP is there/available?). Anyway, what gives? Is this still buggy? Any suggestions? If not, easy enough workarounds? What's interesting is that this works perfectly fine on another user's Mac; however, he's running a slightly older version (4.3.6 vs. 4.3.12).

  • GLSL Atmospheric Scattering Issue

    - by mtf1200
    I am attempting to use Sean O'Neil's shaders to accomplish atmospheric scattering. For now I am just using SkyFromSpace and GroundFromSpace. The atmosphere works fine, but the planet itself is just a giant dark sphere with a white blotch that follows the camera. I think the problem might rest in the "v3Attenuation" variable, as when this is removed the sphere is shown (albeit without scattering). Here is the vertex shader. Thanks for your time!

        uniform mat4 g_WorldViewProjectionMatrix;
        uniform mat4 g_WorldMatrix;
        uniform vec3 m_v3CameraPos;       // The camera's current position
        uniform vec3 m_v3LightPos;        // The direction vector to the light source
        uniform vec3 m_v3InvWavelength;   // 1 / pow(wavelength, 4) for the red, green, and blue channels
        uniform float m_fCameraHeight;    // The camera's current height
        uniform float m_fCameraHeight2;   // fCameraHeight^2
        uniform float m_fOuterRadius;     // The outer (atmosphere) radius
        uniform float m_fOuterRadius2;    // fOuterRadius^2
        uniform float m_fInnerRadius;     // The inner (planetary) radius
        uniform float m_fInnerRadius2;    // fInnerRadius^2
        uniform float m_fKrESun;          // Kr * ESun
        uniform float m_fKmESun;          // Km * ESun
        uniform float m_fKr4PI;           // Kr * 4 * PI
        uniform float m_fKm4PI;           // Km * 4 * PI
        uniform float m_fScale;           // 1 / (fOuterRadius - fInnerRadius)
        uniform float m_fScaleDepth;      // The scale depth (i.e. the altitude at which
                                          // the atmosphere's average density is found)
        uniform float m_fScaleOverScaleDepth; // fScale / fScaleDepth

        attribute vec4 inPosition;

        vec3 v3ELightPos = vec3(g_WorldMatrix * vec4(m_v3LightPos, 1.0));
        vec3 v3ECameraPos = vec3(g_WorldMatrix * vec4(m_v3CameraPos, 1.0));

        const int nSamples = 2;
        const float fSamples = 2.0;

        varying vec4 color;

        float scale(float fCos)
        {
            float x = 1.0 - fCos;
            return m_fScaleDepth * exp(-0.00287 + x*(0.459 + x*(3.83 + x*(-6.80 + x*5.25))));
        }

        void main(void)
        {
            gl_Position = g_WorldViewProjectionMatrix * inPosition;

            // Get the ray from the camera to the vertex and its length (which is
            // the far point of the ray passing through the atmosphere)
            vec3 v3Pos = vec3(g_WorldMatrix * inPosition);
            vec3 v3Ray = v3Pos - v3ECameraPos;
            float fFar = length(v3Ray);
            v3Ray /= fFar;

            // Calculate the closest intersection of the ray with the outer atmosphere
            // (which is the near point of the ray passing through the atmosphere)
            float B = 2.0 * dot(m_v3CameraPos, v3Ray);
            float C = m_fCameraHeight2 - m_fOuterRadius2;
            float fDet = max(0.0, B*B - 4.0 * C);
            float fNear = 0.5 * (-B - sqrt(fDet));

            // Calculate the ray's starting position, then calculate its scattering offset
            vec3 v3Start = m_v3CameraPos + v3Ray * fNear;
            fFar -= fNear;
            float fDepth = exp((m_fInnerRadius - m_fOuterRadius) / m_fScaleDepth);
            float fCameraAngle = dot(-v3Ray, v3Pos) / fFar;
            float fLightAngle = dot(v3ELightPos, v3Pos) / fFar;
            float fCameraScale = scale(fCameraAngle);
            float fLightScale = scale(fLightAngle);
            float fCameraOffset = fDepth * fCameraScale;
            float fTemp = (fLightScale + fCameraScale);

            // Initialize the scattering loop variables
            float fSampleLength = fFar / fSamples;
            float fScaledLength = fSampleLength * m_fScale;
            vec3 v3SampleRay = v3Ray * fSampleLength;
            vec3 v3SamplePoint = v3Start + v3SampleRay * 0.5;

            // Now loop through the sample rays
            vec3 v3FrontColor = vec3(0.0, 0.0, 0.0);
            vec3 v3Attenuate;
            for(int i=0; i<nSamples; i++)
            {
                float fHeight = length(v3SamplePoint);
                float fDepth = exp(m_fScaleOverScaleDepth * (m_fInnerRadius - fHeight));
                float fScatter = fDepth * fTemp - fCameraOffset;
                v3Attenuate = exp(-fScatter * (m_v3InvWavelength * m_fKr4PI + m_fKm4PI));
                v3FrontColor += v3Attenuate * (fDepth * fScaledLength);
                v3SamplePoint += v3SampleRay;
            }

            vec3 first = v3FrontColor * (m_v3InvWavelength * m_fKrESun + m_fKmESun);
            vec3 secondary = v3Attenuate;
            color = vec4((first + vec3(0.25,0.25,0.25) * secondary), 1.0);
            // ^^ that color is passed to the frag shader and is used as the gl_FragColor
        }

    Here is also an image of the problem.

  • Video games, content strategy, and failure - oh my.

    - by Roger Hart
    Last night was the CS London group's event Content Strategy, Manhattan Style. Yes, it's a terrible title, feeling like a self-conscious grasp for chic, sadly commensurate with the venue. Fortunately, this was not commensurate with the event itself, which was lively, relevant, and engaging. Although mostly if you're a consultant. This is a strong strain in current content strategy discourse, and I think we're going to see it remedied quite soon. Not least in Paris on Friday.

    A lot of the bloggers, speakers, and commentators in the sphere are consultants, or part of agencies and other consulting organisations. A lot of the talk is about how you sell content strategy to your clients. This is completely acceptable. Of course it is. And it's actually useful if that's something you regularly have to do. To an extent, it's even portable to those of us who have to sell content strategy within an organisation. We're still competing for credibility and resource. What we're doing less is living in the beginning of a project. This was touched on by Jeffrey MacIntyre (albeit in a your-clients kind of a way), who described "the day two problem". Companies, he suggested, build websites for launch day, and forget about the need for them to be ongoing entities. Consultants, agencies, or even internal folks on short projects will live through Day Two quite often: the trainwreck moment where somebody realises that even if the content is right (which it often isn't), and on time (which it often isn't), it'll be redundant, outdated, or inaccurate by the end of the week/month/fickle social media attention cycle.

    The thing about living through a lot of Day Two is that you see a lot of failure. Nothing succeeds like failure? Failure is good. When it's structured right, it's an awesome tool for learning - that's kind of how video games work. I'm chewing over a whole blog post about this, but basically in game-like learning, you try, fail, go round the loop again. Success eventually yields joy. It's a relatively well-known phenomenon. It works best when that failing step is acutely felt, but extremely inexpensive. Dying in Portal is highly frustrating and surprisingly characterful, but the save-points are well designed and the reload unintrusive. The barrier to re-entry into the loop is very low, as is the cost of your failure out in meatspace. So it's easy (and fun) to learn. Yeah, spot the difference with business failure.

    As an external content strategist, you get to rock up with a big old folder full of other companies' Day Two (and ongoing day two hundred) failures. You can't send the client round the learning loop - although you may well be there because they've been round it once - but you can show other people's round trip. It's not as compelling, but it's not bad. What about internal content strategists? We can still point to things that are wrong, and there are some very compelling tools at our disposal - content inventories, user testing, and analytics, for instance. But if we're picking up big, organically sprawling legacy content, Day Two may well be a distant memory, and the felt experience of web content failure is unlikely to be immediate to many people in the organisation.

    What to do? My hunch here is that the first task is to create something immediate and felt, but that it probably needs to be a success. Something quickly doable and visible - a content problem solved with a measurable business result. Now, that's a tall order; but scrape off the "quickly" and it's the whole reason we're here.
At Red Gate, I've started with the text book fear and passion introduction to content strategy. In fact, I just typo'd that as "contempt strategy", and it isn't a bad description. Yelling "look at this, our website is rubbish!" gets you the initial attention, but it doesn't make you many friends. And if you don't produce something pretty sharp-ish, it's easy to lose the momentum you built up for change. The first thing I've done - after the visual content inventory - is to delete a bunch of stuff. About 70% of the SQL Compare web content has gone, in fact. This is a really, really cheap operation. It's visible, and it's powerful. It's cheap because you don't have to create any new content. It's not free, however, because you do have to validate your deletions. This means analytics, actually reading that content, and talking to people whose business purposes that content has to serve. If nobody outside the company uses it, and nobody inside the company thinks they ought to, that's a no-brainer for the delete list. The payoff here is twofold. There's the nebulous hard-to-illustrate "bad content does user experience and brand damage" argument; and there's the "nobody has to spend time (money) maintaining this now" argument. One or both are easily felt, and the second at least should be measurable. But that's just one approach, and I'd be interested to hear from any other internal content strategy folks about how they get buy-in, maintain momentum, and generally get things done.

  • Which of these algorithms is best for my goal?

    - by JonathonG
    I have created a program that restricts the mouse to a certain region based on a black/white bitmap. The program is 100% functional as-is, but uses an inaccurate, albeit fast, algorithm for repositioning the mouse when it strays outside the area. Currently, when the mouse moves outside the area, basically what happens is this:

    1. A line is drawn between a pre-defined static point inside the region and the mouse's new position.
    2. The point where that line intersects the edge of the allowed area is found.
    3. The mouse is moved to that point.

    This works, but only works perfectly for a perfect circle with the pre-defined point set in the exact center. Unfortunately, this will never be the case. The application will be used with a variety of rectangles and irregular, amorphous shapes. On such shapes, the point where the line drawn intersects the edge will usually not be the closest point on the shape to the mouse. I need to create a new algorithm that finds the closest point to the mouse's new position on the edge of the allowed area. I have several ideas about this, but I am not sure of their validity, in that they may have far too much overhead. While I am not asking for code, it might help to know that I am using Objective-C / Cocoa, developing for OS X, as I feel the language being used might affect the efficiency of potential methods. My ideas are:

    1. Using a bit of trigonometry to project lines would work, but that would require some kind of intense algorithm to test every point on every line until it found the edge of the region... That seems too resource intensive, since there could be something like 200 lines that would each have as many as 200 pixels checked for black/white.

    2. Using something like an A* pathing algorithm to find the shortest path to a black pixel; however, A* seems resource intensive, even though I could probably restrict it to only checking roughly in one direction. It also seems like it will take more time and effort than I have available to spend on this small portion of the much larger project I am working on; correct me if I am wrong and it would not be a significant amount of code (100 lines or around there).

    3. Mapping the border of the region before the application begins running the event tap loop. I think I could accomplish this by using my current line-based algorithm to find an edge point and then initiating an algorithm that checks all 8 pixels around that pixel, finds the next border pixel in one direction, and continues to do this until it comes back to the starting pixel. I could then store that data in an array to be used for the entire duration of the program, and have the mouse repositioning method check the array for the closest pixel on the border to the mouse target position. That last method would presumably execute its initial border mapping fairly quickly. (It would only have to map between 2,000 and 8,000 pixels, which means 8,000 to 64,000 checked, and I could even permanently store the data to make launching faster.) However, I am uncertain as to how much overhead it would take to scan through that array for the shortest distance for every single mouse move event... I suppose there could be a shortcut to restrict the number of elements in the array that will be checked to a variable number, starting with the intersecting point on the line (from my original algorithm), and raise/lower that number to experiment with the overhead/accuracy tradeoff.
    Please let me know if I am overthinking this and there is an easier way that will work just fine, or which of these methods would be able to execute something like 30 times per second to keep mouse movement smooth, or if you have a better/faster method. I've posted relevant parts of my code below for reference, and included an example of what the area might look like. (I check the color value against a loaded bitmap that is black/white.)

        //
        // This part of my code runs every single time the mouse moves.
        //
        CGPoint point = CGEventGetLocation(event);
        float tX = point.x;
        float tY = point.y;
        if( is_in_area(tX,tY, mouse_mask)){
            // target is inside O.K. area, do nothing
        }else{
            CGPoint target;
            //point inside restricted region:
            float iX = 600; // inside x
            float iY = 500; // inside y
            // delta to midpoint between iX,iY and tX,tY
            float dX;
            float dY;
            float accuracy = .5; //accuracy to loop until reached
            do {
                dX = (tX-iX)/2;
                dY = (tY-iY)/2;
                if(is_in_area((tX-dX),(tY-dY),mouse_mask)){
                    iX += dX;
                    iY += dY;
                } else {
                    tX -= dX;
                    tY -= dY;
                }
            } while (abs(dX)>accuracy || abs(dY)>accuracy);
            target = CGPointMake(roundf(tX), roundf(tY));
            CGDisplayMoveCursorToPoint(CGMainDisplayID(),target);
        }

    Here is "is_in_area(int x, int y)":

        bool is_in_area(NSInteger x, NSInteger y, NSBitmapImageRep *mouse_mask){
            NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];
            NSUInteger pixel[4];
            [mouse_mask getPixel:pixel atX:x y:y];
            if(pixel[0]!= 0){
                [pool release];
                return false;
            }
            [pool release];
            return true;
        }
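
    Regarding idea 3, here is a rough sketch of it in C++ (rather than Objective-C): collect the border pixels once, then answer each mouse move with a linear nearest-neighbour scan. isInArea here is a hypothetical stand-in for the bitmap lookup above.

        // Sketch of idea 3: precompute the region's border pixels, then find
        // the closest one per mouse-move event.
        #include <climits>
        #include <cstdio>
        #include <vector>

        struct Pt { int x, y; };

        // Stand-in mask (a rectangle); the real version reads the bitmap.
        bool isInArea(int x, int y) {
            return x > 100 && x < 500 && y > 100 && y < 400;
        }

        // A pixel is on the border if it is inside the area but has a
        // 4-neighbour that is outside.
        std::vector<Pt> mapBorder(int w, int h) {
            std::vector<Pt> border;
            for (int y = 1; y + 1 < h; ++y)
                for (int x = 1; x + 1 < w; ++x)
                    if (isInArea(x, y) &&
                        (!isInArea(x - 1, y) || !isInArea(x + 1, y) ||
                         !isInArea(x, y - 1) || !isInArea(x, y + 1)))
                        border.push_back({x, y});
            return border; // a few thousand points for the sizes you describe
        }

        // Per-event query: ~8,000 squared-distance checks is cheap at 30 Hz.
        Pt closestBorderPoint(const std::vector<Pt>& border, int tx, int ty) {
            Pt best = border.front();
            long long bestD = LLONG_MAX; // compare squared distances, no sqrt
            for (const Pt& p : border) {
                long long dx = p.x - tx, dy = p.y - ty;
                long long d = dx * dx + dy * dy;
                if (d < bestD) { bestD = d; best = p; }
            }
            return best;
        }

        int main() {
            std::vector<Pt> border = mapBorder(640, 480);
            Pt p = closestBorderPoint(border, 50, 50);
            std::printf("nearest border pixel to (50,50): (%d,%d)\n", p.x, p.y);
            return 0;
        }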

  • Is Data Science “Science”?

    - by BuckWoody
    I hold the term “science” in very high esteem. I grew up on the Space Coast in Florida, and eventually worked at the Kennedy Space Center, surrounded by very intelligent people who worked in various scientific fields. Recently a new term has entered the computing dialog – “Data Scientist”. Since it’s not a standard term, it has a lot of definitions, and in fact has been disputed as a correct term. After all, the reasoning goes, if there’s no such thing as “Data Science” then how can there be a Data Scientist? This argument has been made before, albeit with a different term – “Computer Science”. In Peter Denning’s excellent article “Is Computer Science Science” (April 2005/Vol. 48, No. 4, COMMUNICATIONS OF THE ACM) there are many points that separate “science” from “engineering” and even “art”. I won’t repeat the content of that article here (I recommend you read it on your own) but will leverage the points he makes there.

    Definition of Science

    To ask the question “is data science ‘science’” we need to start with a definition of terms. Various references put the definition into the same basic areas:

    - Study of the physical world
    - Systematic and/or disciplined study of a subject area

    ...and then they include the things studied, the bodies of knowledge and so on. The word itself comes from Latin, and means merely “to know” or “to study to know”. Greek divides knowledge further into “truth” (episteme) and practical use or effects (tekhne). Normally computing falls into the second realm.

    Definition of Data Science

    And now a more controversial definition: Data Science. This term is so new and perhaps so niche that the major dictionaries haven’t yet picked it up (my OED reference is older – can’t afford to pop for the online registration at present). Researching the term's general use, I created an amalgam of the definitions this way: “Studying and applying mathematical and other techniques to derive information from complex data sets.” Using this definition, data science certainly seems to be science - it's learning about and studying some object or area using systematic methods. But implicit within the definition is the word “application”, which makes the process more akin to engineering or even technology than science. In fact, I find that using these techniques – and data itself – is part of science, not science itself. I leave out the concept of studying data patterns or algorithms as part of this discipline. That is actually a domain I see within research, mathematics or computer science. That of course is a type of science, but does not seek practical applications. As part of the argument against calling it “Data Science”, some point to the scientific method of creating a hypothesis, testing with controls, testing results against the hypothesis, and documenting for repeatability. These are not steps that we often take in working with data. We normally start with a question, and fit patterns and algorithms to predict outcomes and find correlations. In this way Data Science is more akin to statistics (and in fact makes heavy use of them) in the process, rather than starting with an assumption and following on with it.

    So, is Data Science “Science”?

    I’m uncertain – and I’m uncertain it matters. Even if we are facing rampant “title inflation” these days (does anyone introduce themselves as a secretary or supervisor anymore?) I can tolerate the term, at least with the intent that we use data to study problems across a wide spectrum, rather than restricting it to a single domain.
    And I also understand those who have worked hard to achieve the very honorable title of “scientist” and who take issue with those who borrow the term without asking. What do you think? Science, or not? Does it matter?

  • Rules for Naming

    - by PointsToShare
    © 2011 By: Dov Trietsch. All rights reserved.

    Naming Documents (or is it “Document, Naming”?)

        Tis but thy name that is my enemy;
        Thou art thyself, though not a Montague.
        What's Montague? It is nor hand, nor foot,
        Nor arm, nor face, nor any other part
        Belonging to a man. O, be some other name!
        What's in a name? That which we call a rose
        By any other name would smell as sweet;
        So Romeo would, were he not Romeo call'd,
        Retain that dear perfection which he owes
        Without that title. Romeo, doff thy name
        And for that name which is no part of thee
        Take all myself.

        Shakespeare - Romeo and Juliet, Act II, Scene 2

    We normally only use the bold portion of the famous Shakespearean quote above, but it is really out of context. As the play unfolds, we learn that a name is all too powerful. Indeed it is because of their names that the doomed lovers die. There might be life and death in a name (BTW, when I wrote this monograph, I was in Hatfield, PA. Remember the Hatfields and the McCoys?) This is a bit extreme, but in the field of Knowledge Management (KM) names are of the utmost importance as well. When I write an article about managing SharePoint sites, how should I name it? “Managing a site” or “Site, managing”? Nine times out of ten I’d opt for the latter. Almost everything we do is “Managing”, so to make life easier for a person looking for meaningful content, we title our articles starting with the differentiator rather than the common factor. As a rule of thumb, we start the name with the noun rather than the verb. It is not what we do that is the primary key; it is what we do it to. So, answer this – is it a “rule of thumb” or a “thumb rule?” This is tough. A lot of what we do when naming is a judgment call. Both thumb and rule are nouns, albeit concrete and abstract (more about this later), but to most people “thumb rule” is meaningless while “rule of thumb” is an idiom. The difference between knowledge and information is that knowledge is meaningful information placed in context. Thus I elect the “rule of thumb”. It is the more meaningful title. Abstract and Concrete are relative terms. Many nouns (and verbs) that are abstract to a commoner are concrete to a practitioner of one profession or another, and may even have different concrete meanings in different professional jargons. Think about “running”. To an executive it means running a business; to a marathoner its meaning is much more literal. Generally speaking, we store and disseminate knowledge within a practice more than we do it in general. Even dictionaries and encyclopedias define terms as they apply to different audiences. The rule of thumb is to put the more concrete first, but within the audience’s jargon. Even the title of this monograph is a question. Do I name it “Naming Documents” or “Documents, Naming”? Well, my own rule of thumb (“Here he goes again!?”) states that the latter is better because it starts with a noun, but this is a document about naming more than it is about documents. The rules of naming also apply to graphs and charts, Excel spreadsheets, and so on. Thus, I vote for the former. A better title could have been “Naming Objects”, only the word “Object” is a bit too abstract. How about just “Naming” or “Naming, rules of”? You get the drift. One of the ways to resolve all of this is to store the documents in Knowledge Bases, which may become the subjects of a future punditry. Knowledge bases use keywords to describe their content. Use a metadata store for the keywords to at least attempt some common ground.
    Here is another general rule (rule of thumb?!!) – put at least one keyword in the title. Use subtitles. Here is an example: Migrating documents – Screening, cleaning, and organizing our knowledge. The main keyword is “documents”; next is “migrating”; other keywords also appear in the subtitle. They are “screening”, “cleaning”, and “organizing”. Any questions? Send me an amply named document by email: [email protected]

  • The design of a generic data synchronizer, or, an [object] that does [actions] with the aid of [helpers]

    - by acheong87
    I'd like to create a generic data-source "synchronizer," where data-source "types" may include MySQL databases, Google Spreadsheets documents, and CSV files, among others. I've been trying to figure out how to structure this in terms of classes and interfaces, keeping in mind (what I've read about) composition vs. inheritance and is-a vs. has-a, but each route I go down seems to violate some principle. For simplicity, assume that all data-sources have a header-row-plus-data-rows format. For example, assume that the first rows of Google Spreadsheets documents and CSV files will have column headers, a.k.a. "fields" (to parallel database fields). Also, eventually, I would like to implement this in PHP, but avoiding language-specific discussion would probably be more productive. Here's an overview of what I've tried.

    Part 1/4: ISyncable

        class CMySQL implements ISyncable
            GetFields() // sql query, pdo statement, whatever
            AddFields()
            RemFields()
            ...
            _dbh

        class CGoogleSpreadsheets implements ISyncable
            GetFields() // zend gdata api
            AddFields()
            RemFields()
            ...
            _spreadsheetKey
            _worksheetId

        class CCsvFile implements ISyncable
            GetFields() // read from buffer
            AddFields()
            RemFields()
            ...
            _buffer

        interface ISyncable
            GetFields()
            AddFields($field1, $field2, ...)
            RemFields($field1, $field2, ...)
            ...
            CanAddFields() // maybe the spreadsheet is locked for write, or
            CanRemFields() // maybe no permission to alter a database table
            ...
            AddRow()
            ModRow()
            RemRow()
            ...
            Open()
            Close()
            ...

    First Question: Does it make sense to use an interface, as above?

    Part 2/4: CSyncer

    Next, the thing that does the syncing.

        class CSyncer
            __construct(ISyncable $A, ISyncable $B)
            Push() // sync A to B
            Pull() // sync B to A
            Sync() // Push() and Pull() only differ in direction; factor.
                   // Sync()'s job is to make sure that the fields on each side
                   // match, to add fields where appropriate and possible, to
                   // account for different column-orderings, etc., and of
                   // course, to add and remove rows as necessary to sync.
            ...
            _A
            _B

    Second Question: Does it make sense to define such a class, or am I treading dangerously close to the "Kingdom of Nouns"?

    Part 3/4: CTranslator? ITranslator?

    Now, here's where I actually get lost, assuming the above is passable. Sometimes, two ISyncables speak different "dialects." For example, believe it or not, Google Spreadsheets (accessed through the Google Data API "list feed") returns column headers lower-cased and stripped of all spaces and symbols! That is, sys_TIMESTAMP is systimestamp, as far as my code can tell. (Yes, I am aware that the "cell feed" does not strip the name so; however, cell-by-cell manipulation is too slow for what I'm doing.) One can imagine other hypothetical examples. Perhaps even the data itself can be in different "dialects." But let's take it as given for now, and not argue this if possible. Third Question: How would you implement "translation"? Note: Taking all this as an exercise, I'm more interested in the "idealized" design rather than the practical one. (God knows that ship sailed when I began this project.)

    Part 4/4: Further Thought

    Here's my train of thought to demonstrate I've thunk, albeit unfruitfully: First, I thought, primitively, "I'll just modify CMySQL::GetFields() to lower-case and strip field names so they're compatible with Google Spreadsheets." But of course, then my class should really be called CMySQLForGoogleSpreadsheets, and that can't be right. So, the thing which translates must exist outside of an ISyncable implementor. And surely it can't be right to make each translation a method in CSyncer. If it exists outside of both ISyncable and CSyncer, then what is it? (Is it even an "it"?) Is it an abstract class, i.e. abstract CTranslator? Is it an interface, since a translator only does, not has, i.e. interface ITranslator? Does it even require instantiation? e.g. If it's an ITranslator, then should its translation methods be static? (I learned what "late static binding" meant, today.) And, dear God, whatever it is, how should a CSyncer use it? Does it "have" it? Is it, "it"? Who am I? ...am I, "I"? I've attempted to break up the question into sub-questions, but essentially my question is singular: How does one implement an object A that conceptually "links" (has) two objects b1 and b2 that share a common interface B, where certain pairs of b1 and b2 require a helper, e.g. a translator, to be handled by A? Something tells me that I've overcomplicated this design, or violated a principle much higher up. Thank you all very much for your time and any advice you can provide.
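
    One hedged way to picture the "what is it?" question (a sketch, in C++ for neutrality since the post avoids PHP specifics; all names are hypothetical): make the translator a decorator/adapter that wraps one ISyncable and normalizes its dialect, so CSyncer never knows a translator exists.

        // Sketch: the translator as a decorator over ISyncable.
        // Only GetFields is shown; the other operations would mirror it.
        #include <string>
        #include <vector>

        struct ISyncable {
            virtual ~ISyncable() = default;
            virtual std::vector<std::string> GetFields() = 0;
            // ... AddFields, RemFields, row operations, Open, Close ...
        };

        // Wraps any ISyncable and translates field names in both directions.
        // CSyncer still just holds two ISyncables; one (or both) of them
        // happens to be wrapped, so "translation" never leaks into CSyncer.
        class TranslatingSyncable : public ISyncable {
        public:
            explicit TranslatingSyncable(ISyncable& inner) : inner_(inner) {}

            std::vector<std::string> GetFields() override {
                std::vector<std::string> out;
                for (const std::string& f : inner_.GetFields())
                    out.push_back(ToCanonical(f)); // "systimestamp" -> "sys_TIMESTAMP"
                return out;
            }

        private:
            // Identity here; a real version would consult a stored field map
            // to undo the lower-casing/stripping done by the list feed.
            std::string ToCanonical(const std::string& raw) { return raw; }

            ISyncable& inner_;
        };

    Under this reading, the answer to "does CSyncer have it?" would be no: the translator is-a ISyncable that has-a ISyncable, and the pairing logic stays exactly where it was.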

  • Revisiting .NET, but what should I focus on?

    - by Wayne M
    After about a two-year hiatus, I'm brushing up on my .NET skills to find a .NET job (my previous two positions involved very little development, or development using legacy technologies, so apart from a few very minor apps I have not touched .NET in close to two years). I'm aware of things like ASP.NET MVC, and I have previously read up on things like NHibernate and DI/IoC, albeit I have yet to use them apart from very trivial "Hello World" type applications. I have a subscription to Rob Conery's Tekpub website and occasionally watch these videos when I have free time.

    My concern is this: I don't live in a very technical area. I would be surprised if any but the most tech-savvy companies have heard of, let alone use, ASP.NET MVC, NHibernate (or even LINQ/EF), or know about IoC. I would be willing to bet a large sum of money that 95% of the possible jobs I could obtain will use the following:

    - Visual SourceSafe, if any VCS at all
    - ASP.NET 2.0 WebForms (3.5 if lucky)
    - Raw ADO.NET on top of a very thin implementation of the Gateway pattern
    - Stored procedures in the database for most CRUD operations
    - Gratuitous use of code-behind, with a Service layer if I'm lucky

    If I were extremely lucky, I might find a shop that has heard of ORMs and either uses one or has written their own data abstraction. Also if I were lucky, the company would be using Model-View-Presenter. In light of this I'm not sure what I should focus on learning. Personally, I would prefer to be using the latest stuff - ASP.NET MVC, NHibernate, jQuery, WCF, etc. Reality says I should go back to the basics, since it looks like most potential opportunities aren't going to be anywhere near the cutting edge, or anywhere close to it. And, as much as I would like to find a position and start to show the other developers the benefits, in my past experience this has usually resulted in my being fired for "not being a team player" and doing things the bad old way. So, I am curious how you would approach a situation like this? What should I focus on, in order to A) reacquaint myself with .NET, and B) prepare myself to obtain a .NET job again that is more than likely going to use techniques that I and most other knowledgeable developers will scoff at?

  • Tactics for using PHP in a high-load site

    - by Ross
    Before you answer this: I have never developed anything popular enough to attain high server loads. Treat me as (sigh) an alien that has just landed on the planet, albeit one that knows PHP and a few optimisation techniques. I'm developing a tool in PHP that could attain quite a lot of users, if it works out right. However, while I'm fully capable of developing the program, I'm pretty much clueless when it comes to making something that can deal with huge traffic. So here are a few questions on it (feel free to turn this question into a resource thread as well).

    Databases

    At the moment I plan to use the MySQLi features in PHP5. However, how should I set up the databases in relation to users and content? Do I actually need multiple databases? At the moment everything's jumbled into one database - although I've been considering spreading user data to one, actual content to another, and finally core site content (template masters etc.) to another. My reasoning behind this is that sending queries to different databases will ease up the load on them, as one database would otherwise carry three sources of load. Also, would this still be effective if they were all on the same server?

    Caching

    I have a template system that is used to build the pages and swap out variables. Master templates are stored in the database, and each time a template is called its cached copy (an HTML document) is used. At the moment I have two types of variable in these templates - a static var and a dynamic var. Static vars are usually things like page names, the name of the site - things that don't change often; dynamic vars are things that change on each page load. My question on this: say I have comments on different articles. Which is the better solution: store the simple comment template and render comments (from a DB call) each time the page is loaded, or store a cached copy of the comments page as an HTML page, where each time a comment is added/edited/deleted the page is recached?

    Finally

    Does anyone have any tips/pointers for running a high-load site on PHP? I'm pretty sure it's a workable language to use - Facebook and Yahoo! give it great precedent - but are there any experiences I should watch out for? Thanks, Ross

  • Best practices concerning view model and model updates with a subset of the fields

    - by Martin
    By picking MVC for developing our new site, I find myself in the midst of "best practices" being developed around me in apparent real time. Two weeks ago, NerdDinner was my guide, but with the development of MVC 2, even it seems outdated. It's a thrilling experience and I feel privileged to be in close contact with intelligent programmers daily. Right now I've stumbled upon an issue I can't seem to get a straight answer on - from all the blogs anyway - and I'd like to get some insight from the community. It's about Editing (read: the Edit action). The bulk of material out there, tutorials and blogs, deals with creating and viewing the model. So while this question may not spell out a question, I hope to get some discussion going, contributing to my decision about the path of development I'm to take.

    My model represents a user with several fields like name, address and email. The names, in fact, have one field each for first name, last name and middle name. The Details view displays all these fields, but you can change only one set of fields at a time, for instance, your names. The user expands a form while the other fields are still visible above and below. So the form that is posted back contains a subset of the fields representing the model. While this is appealing to us and our layout concerns, for various reasons, it is to be shunned by serious MVC developers. I've been reading about some patterns and best practices, and it seems that this is not in keeping with the paradigm of view model == view. Or have I got it wrong?

    Anyway, NerdDinner dictates using FormCollection and UpdateModel. All the null fields are happily ignored. Since then, the MVC community has abandoned this approach to such a degree that a bug in MVC 2 was not discovered. UpdateModel does not work without a complete model in your FormCollection. The view model pattern receiving the most praise seems to be the dedicated view model that contains a custom view model entity, and it is the only one that my design issue could be made compatible with. It entails a tedious amount of mapping, albeit lightened by the use of AutoMapper and the ideas of Jimmy Bogard, that may or may not be worthwhile. He also proposes a 1:1 relationship between view and view model.

    In keeping with these design paradigms, I am to create a view and an associated view model for each of my expanding sets of fields. The view models would each be nearly identical, differing only in the fields which are read-only, with the views also containing much repeated markup. This seems absurd to me. In future I may want to be able to display two, more or all sets of fields open simultaneously. I will most attentively read the discussion I hope to spark. Many thanks in advance.

  • CKEDITOR - Is there anyway to prevent formatting code in SOURCE mode?

    - by Lev
    I've spent the good portion of my day trying to figure this out, and I figured I'd finally just give in and ask. How can you prevent ANY automatic formatting when in SOURCE mode? I like to edit HTML source code directly instead of using the WYSIWYG interface, but whenever I write new lines, or lay out tags how I would indent them, it all gets formatted when I switch to WYSIWYG mode and then back to SOURCE mode again. I stumbled upon this earlier: http://dev.fckeditor.net/ticket/993 which alluded to a setting, which may have existed once upon a time, that would be exactly what I'm after. I just want to know how I can completely turn off all automatic formatting when editing in SOURCE mode.

    After browsing this site for hours on end and finding absolutely nothing here or on Google, I came up with a solution I thought would be foolproof (albeit not a pleasant one). I learned about the "protectedSource" setting, so I thought, well, maybe I can just create an HTML comment tag before all my HTML and another after it, and then push a regular expression matching the comment tags into the protectedSource array. But even that (believe it or not) doesn't work. I've tried my expression straight up in the browser outside of CKEditor and it is working, but CKEditor doesn't protect the code as expected (which I suspect is a bug involving comment tags, since I can get it to work with other strings). In case you are wondering, this is what I had hoped would work as a workaround, but doesn't:

        config.protectedSource.push( /<!-- src -->[\s\S]*<!-- end src-->/gi );

    ...and what I planned on doing (for what appears to be the lack of a setting like the one I've described) was to nest all my HTML within the commented tags like this:

        <!-- src -->
        <div>some code that shouldn't be messed with (but is)</div>
        <!-- end src -->

    I'd love to hear if anyone has any suggestions for this scenario, or knows of a setting like the one I've described, or can even just fill me in as to why I can't get protectedSource to work properly with two comment tags. I really think it's got to be a bug, because I can get so many other expressions to work fine, and I can even protect HTML within the area of a single comment tag, but I simply cannot get HTML within two different comment tags to stay untouched. :( I've wasted about 6-7 hours on this so far today, so if anyone can shed any light on it I would be very grateful! Thanks for reading! ;)

  • Yii - Custom GridView with Multiple Tables

    - by savinger
    So, I've extended GridView to include an Advanced Search feature tailored to the needs of my organization. Filter - lets you show/hide columns in the table, and you can also reorder columns by dragging the little drag icon to the left of each item. Sort - allows for the selection of multiple columns, specifying Ascending or Descending. Search - select your column and insert search parameters, with operators tailored to the data type of the selected column. Version 1 works, albeit slowly. Basically, I had my hands in the inner workings of CGridView, where I snatch the results from the DataProvider and do the searching and sorting in PHP before rendering the table contents. Now I'm writing Version 2, where I aim to focus on clever CDbCriteria creation, letting MySQL do the heavy lifting so it will run quicker. The implementation is trivial when dealing with a single database table. The difficulty arises when I'm dealing with 2 or more tables... For example, if the user intends to search on a field that is a STAT relation, I need that relation to be present in my query. Here's the question: how do I assure that Yii includes all with relations in my query so that I can include comparisons? I've included all my relations with my criteria in the model's search function, and I've tried CDbCriteria's together...

        public function search() {
            $criteria = new CDbCriteria;
            $criteria->compare('id', $this->id);
            $criteria->compare( ...
            ...
            $criteria->with = array('relation1','relation2','relation3');
            $criteria->together = true;
            return new CActiveDataProvider( get_class($this), array(
                'criteria' => $criteria,
                'pagination' => array('pageSize' => 50)
            ));
        }

    But I still get errors like this...

        CDbCommand failed to execute the SQL statement: SQLSTATE[42S22]: Column not
        found: 1054 Unknown column 't.relation3' in 'where clause'. The SQL statement
        executed was: SELECT COUNT(DISTINCT `t`.`id`) FROM `table` `t`
        LEFT OUTER JOIN `relation_table` `relation0` ON (`t`.`id`=`relation0`.`id`)
        LEFT OUTER JOIN `relation_table` `relation1` ON (`t`.`id`=`relation1`.`id`)
        WHERE (`t`.`relation3` < 1234567890)

    Where relation0 and relation1 are BELONGS_TO relations, but any STAT relations are missing. Furthermore, why is the query a SELECT COUNT(DISTINCT `t`.`id`)?

  • What are the Options for Storing Hierarchical Data in a Relational Database?

    - by orangepips
    Good Overviews

    - One more Nested Intervals vs. Adjacency List comparison: the best comparison of Adjacency List, Materialized Path, Nested Set and Nested Interval I've found.
    - Models for hierarchical data: slides with good explanations of tradeoffs and example usage.
    - Representing hierarchies in MySQL: very good overview of Nested Set in particular.
    - Hierarchical data in RDBMSs: the most comprehensive and well-organized set of links I've seen, but not much in the way of explanation.

    Options

    Ones I am aware of, and general features:

    Adjacency List
    - Columns: ID, ParentID
    - Easy to implement.
    - Cheap node moves, inserts, and deletes.
    - Expensive to find level (can store as a computed column), ancestry & descendants (Bridge Hierarchy combined with level column can solve), path (Lineage Column can solve).
    - Use Common Table Expressions in those databases that support them to traverse.

    Nested Set (a.k.a. Modified Preorder Tree Traversal)
    - First described by Joe Celko - covered in depth in his book Trees and Hierarchies in SQL for Smarties.
    - Columns: Left, Right
    - Cheap level, ancestry, descendants.
    - Compared to Adjacency List: moves, inserts, and deletes are more expensive.
    - Requires a specific sort order (e.g. created). So sorting all descendants in a different order requires additional work.

    Nested Intervals
    - Combination of Nested Sets and Materialized Path where left/right columns are floating-point decimals instead of integers and encode the path information.

    Bridge Table (a.k.a. Closure Table: some good ideas about how to use triggers for maintaining this approach)
    - Columns: ancestor, descendant
    - Stands apart from the table it describes.
    - Can include some nodes in more than one hierarchy.
    - Cheap ancestry and descendants (albeit not in what order).
    - For complete knowledge of a hierarchy, needs to be combined with another option.

    Flat Table
    - A modification of the Adjacency List that adds a Level and Rank (e.g. ordering) column to each record.
    - Expensive move and delete.
    - Cheap ancestry and descendants.
    - Good use: threaded discussion - forums / blog comments.

    Lineage Column (a.k.a. Materialized Path, Path Enumeration)
    - Column: lineage (e.g. /parent/child/grandchild/etc...)
    - Limit to how deep the hierarchy can be.
    - Descendants cheap (e.g. LEFT(lineage, #) = '/enumerated/path')
    - Ancestry tricky (database-specific queries).

    Database-Specific Notes

    MySQL
    - Use session variables for Adjacency List.

    Oracle
    - Use CONNECT BY to traverse Adjacency Lists.

    PostgreSQL
    - ltree datatype for Materialized Path.

    SQL Server
    - General summary.
    - 2008 offers the HierarchyId data type, which appears to help with the Lineage Column approach and expands the depth that can be represented.

    Read the article

  • Implementing coroutines in Java

    - by JUST MY correct OPINION
    This question is related to my question on existing coroutine implementations in Java. If, as I suspect, it turns out that there is no full implementation of coroutines currently available in Java, what would be required to implement them?

    As I said in that question, I know about the following:

    - You can implement "coroutines" as threads/thread pools behind the scenes.
    - You can do tricksy things with JVM bytecode behind the scenes to make coroutines possible.
    - The so-called "Da Vinci Machine" JVM implementation has primitives that make coroutines doable without bytecode manipulation.
    - There are also various JNI-based approaches to coroutines.

    I'll address each one's deficiencies in turn.

    Thread-based coroutines

    This "solution" is pathological. The whole point of coroutines is to avoid the overhead of threading, locking, kernel scheduling, etc. Coroutines are supposed to be light and fast and to execute only in user space. Implementing them in terms of full-tilt threads with tight restrictions gets rid of all the advantages (a sketch of this approach appears below).

    JVM bytecode manipulation

    This solution is more practical, albeit a bit difficult to pull off. It is roughly the same as dropping down into assembly language for coroutine libraries in C (which is how many of them work), with the advantage that you have only one architecture to worry about and get right. It also ties you down to running your code only on fully compliant JVM stacks (which means, for example, no Android) unless you can find a way to do the same thing on the non-compliant stack. If you do find a way to do this, however, you have now doubled your system complexity and testing needs.

    The Da Vinci Machine

    The Da Vinci Machine is cool for experimentation, but since it is not a standard JVM its features aren't going to be available everywhere. Indeed, I suspect most production environments would specifically forbid its use. Thus I could use it for cool experiments, but not for any code I expect to release to the real world. It also shares a problem with the JVM bytecode manipulation solution above: it won't work on alternative stacks (like Android's).

    JNI implementation

    This solution renders the point of doing this in Java at all moot. Each combination of CPU and operating system requires independent testing, and each is a point of potentially frustrating subtle failure. Alternatively, of course, I could tie myself to one platform entirely, but this, too, makes the point of doing things in Java entirely moot.

    So...

    Is there any way to implement coroutines in Java without using one of these four techniques? Or will I be forced to use the one of those four that smells the least (JVM manipulation) instead?
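    For concreteness, here is a minimal sketch of the first (thread-based) approach, the one the question dismisses: a generator-style coroutine built on a SynchronousQueue handoff. It behaves like a coroutine, but every "yield" is a real thread rendezvous, which is exactly the kernel-scheduling overhead being objected to:

        import java.util.concurrent.SynchronousQueue;

        public class ThreadCoroutine {
            public static void main(String[] args) throws InterruptedException {
                SynchronousQueue<Integer> channel = new SynchronousQueue<>();

                Thread producer = new Thread(() -> {
                    try {
                        for (int i = 0; i < 5; i++) {
                            channel.put(i); // "yield i": blocks until the consumer takes it
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                producer.setDaemon(true);
                producer.start();

                for (int i = 0; i < 5; i++) {
                    System.out.println("received " + channel.take()); // "resume"
                }
            }
        }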

    Read the article

  • Where are the function literals in c++?

    - by academicRobot
    First of all, maybe "literals" is not the right term for this concept, but it's the closest I could think of (not literals in the sense of functions as first-class citizens). The idea is that when you make a conventional function call, it compiles to something like this:

        callq <immediate address>

    But if you make a function call using a function pointer, it compiles to something like this:

        mov <memory location>,%rax
        callq *%rax

    Which is all well and good. However, what if I'm writing a template library that requires a callback of some sort with a specified argument list, and the user of the library is expected to know what function they want to call at compile time? Then I would like to write my template to accept a function literal as a template parameter. So, similar to

        template <int int_literal> struct my_template {...};

    I'd like to write

        template <func_literal_t func_literal> struct my_template {...};

    and have calls to func_literal within my_template compile to callq <immediate address>.

    Is there a facility in C++ for this, or a workaround to achieve the same effect? If not, why not (e.g. some cataclysmic side effects)? How about C++0x or another language? Solutions that are not portable are fine. Solutions that include the use of member function pointers would be ideal. I'm not particularly interested in being told "You are a <socially unacceptable term for a person of low IQ>, just use function pointers/functors." This is a curiosity-based question, and it seems that it might be useful in some (albeit limited) applications. It also seems like this should be possible, since function names are just placeholders for a (relative) memory address, so why not allow more liberal use (e.g. aliasing) of this placeholder?

    p.s. I use function pointers and function objects all the time and they are great. But this post got me thinking about the "don't pay for what you don't use" principle in relation to function calls, and it seems like forcing the use of function pointers or a similar facility when the function is known at compile time is a violation of this principle, though a small one.
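    For what it's worth, C++ does have a facility close to this: a function pointer can be a non-type template parameter, making the callee a compile-time constant that the compiler can call directly or inline. A minimal sketch, with illustrative names:

        #include <cstdio>

        void hello() { std::puts("hello"); }

        // F is a non-type template parameter of function-pointer type: it must
        // be a compile-time constant, so calls through it can compile down to
        // a direct call rather than an indirect one.
        template <void (*F)()>
        struct my_template {
            void call() const { F(); }
        };

        int main() {
            my_template<&hello> t;   // '&' is optional; plain 'hello' also works
            t.call();                // effectively callq <immediate address>
            return 0;
        }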

    Read the article

  • CSS Z-Index with Gradient Background

    - by Jona
    I'm making a small webpage where I would like the top banner, with some text, to remain on top, as such:

    HTML:

        <div id="topBanner">
            <h1>Some Text</h1>
        </div>

    CSS:

        #topBanner {
            position: fixed;
            background-color: #CCCCCC;
            width: 100%;
            height: 200px;
            top: 0;
            left: 0;
            z-index: 900;
            background: -moz-linear-gradient(top, rgba(204,204,204,0.65) 0%, rgba(204,204,204,0.44) 32%, rgba(204,204,204,0.12) 82%, rgba(204,204,204,0) 100%); /* FF3.6+ */
            background: -webkit-gradient(linear, left top, left bottom, color-stop(0%,rgba(204,204,204,0.65)), color-stop(32%,rgba(204,204,204,0.44)), color-stop(82%,rgba(204,204,204,0.12)), color-stop(100%,rgba(204,204,204,0))); /* Chrome,Safari4+ */
            background: -webkit-linear-gradient(top, rgba(204,204,204,0.65) 0%,rgba(204,204,204,0.44) 32%,rgba(204,204,204,0.12) 82%,rgba(204,204,204,0) 100%); /* Chrome10+,Safari5.1+ */
            background: -o-linear-gradient(top, rgba(204,204,204,0.65) 0%,rgba(204,204,204,0.44) 32%,rgba(204,204,204,0.12) 82%,rgba(204,204,204,0) 100%); /* Opera 11.10+ */
            background: -ms-linear-gradient(top, rgba(204,204,204,0.65) 0%,rgba(204,204,204,0.44) 32%,rgba(204,204,204,0.12) 82%,rgba(204,204,204,0) 100%); /* IE10+ */
            background: linear-gradient(to bottom, rgba(204,204,204,0.65) 0%,rgba(204,204,204,0.44) 32%,rgba(204,204,204,0.12) 82%,rgba(204,204,204,0) 100%); /* W3C */
            filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#a6cccccc', endColorstr='#00cccccc',GradientType=0 ); /* IE6-9 */
        }

        /* WebPage Header */
        h1 {
            font-size: 3em;
            color: blue;
            text-shadow: #CCCCCC 2px 2px 2px, #000 0 -1px 2px;
            position: absolute;
            width: 570px;
            left: 50%;
            right: 50%;
            line-height: 20px;
            margin-left: -285px;
            z-index: 999;
        }

    The z-index works fine, except that because I'm using a gradient, any time I scroll down the elements behind the banner are still visible, albeit somewhat transparent. Is there any way to make them totally invisible? i.e., what I'm trying to do is make it behave as though the banner were a solid color, even though it's a gradient. Thanks in advance for any help!
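    One way to get that effect, sketched under the assumption that the page background is plain white: keep the gradient but make it fully opaque, fading from grey into the page's own background colour instead of into transparency. Nothing behind the banner then shows through:

        /* Hypothetical replacement for the rgba() stack above: an opaque
           gradient from grey to the (assumed white) page background, so
           scrolled content is completely hidden behind the banner. */
        #topBanner {
            position: fixed;
            top: 0;
            left: 0;
            width: 100%;
            height: 200px;
            z-index: 900;
            background: #cccccc; /* fallback for browsers without gradient support */
            background: linear-gradient(to bottom, #cccccc 0%, #ffffff 100%);
        }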

    Read the article

  • Google Maps rendering locally but not in live environment

    - by marcusstarnes
    I have a page that renders a simple Google map for a specified location. The map renders without any problems at all when I run it locally on localhost; however, when I deploy this code to our live web servers (using our LIVE Google API key for the appropriate domain) it fails to render, and after putting a series of alerts in the page's JavaScript, it appears that the initialize method (which should be called within the body onLoad event) is not being called.

    When I view the HTML source rendered on the live server, it appears exactly as per the local version of the site (including the call to initialize() within the body onLoad event), albeit with the different Maps API key. I have output the host (alert(window.location.host);) to ensure that the key I generated via the Google Maps API site corresponds exactly to the live server, which it does.

    Does anyone have any ideas why it would be working locally but not when deployed to the live servers? The live site is hosted on two load-balanced web servers. This is the JavaScript that is rendered:

        <script src="http://maps.google.com/maps?file=api&amp;v=2&amp;sensor=false&amp;key=ABQIAAAA-BU8POZj19wRlTaKIXVM9xTz76xxk4yAELG9u79oXrhnLTB5NRRvAZ-bkKn1x8J68nfRTVOIWNPJEA" type="text/javascript"></script>
        <script type="text/javascript">
            var map;
            var geocoder;
            alert(window.location.host);

            function initialize() {
                if (GBrowserIsCompatible()) {
                    map = new GMap2(document.getElementById("businessMap"));
                    map.setUIToDefault();
                    geocoder = new GClientGeocoder();
                    showAddress('St Margarets Street SW1P 3 London');
                }
            }

            function showAddress(address) {
                geocoder.getLatLng(
                    address,
                    function(point) {
                        if (!point) {
                            // Address could not be located.
                            jQuery('#googleMap').hide();
                        } else {
                            map.setCenter(point, 13);
                            var marker = new GMarker(point);
                            map.addOverlay(marker);
                            var html = 'Address info for the marker';
                            marker.openInfoWindow(html);
                            GEvent.addListener(marker, "click", function() {
                                marker.openInfoWindowHtml(html);
                            });
                        }
                    }
                );
            }
        </script>

    Any help would be much appreciated. Thanks.
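    A hedged debugging sketch, not a confirmed diagnosis: when initialize never fires only on the live servers, a common cause is some other live-only script overwriting the body/window onload handler. Attaching the handler additively (and surfacing script errors) makes such a conflict visible instead of silent:

        // addEventListener handlers are not clobbered by a later
        // window.onload = ... assignment elsewhere on the page.
        if (window.addEventListener) {
            window.addEventListener('load', initialize, false);
        } else if (window.attachEvent) {
            window.attachEvent('onload', initialize); // older IE
        }

        // Optional: surface any script error that might be aborting
        // execution before onload on the live site.
        window.onerror = function (msg, url, line) {
            alert('JS error: ' + msg + ' (' + url + ':' + line + ')');
        };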

    Read the article

  • Python: (sampling with replacement): efficient algorithm to extract the set of DISSIMILAR N-tuples from a set

    - by Homunculus Reticulli
    I have a set of items from which I want to select DISSIMILAR tuples (more on the definition of dissimilar tuples later). The set could contain potentially several thousand items, although typically it would contain only a few hundred.

    I am trying to write a generic algorithm that will allow me to select N items from the original set to form an N-tuple. The new set of selected N-tuples should be DISSIMILAR. An N-tuple A is said to be DISSIMILAR to another N-tuple B if and only if: every pair (2-tuple) that occurs in A DOES NOT appear in B.

    Note: for this algorithm, a 2-tuple (pair) is considered SIMILAR/IDENTICAL if it contains the same elements, i.e. (x,y) is considered the same as (y,x).

    This is a (possible variation on the) classic Urn Problem. A trivial (pseudocode) implementation of this algorithm would be something along the lines of:

        def fetch_unique_tuples(original_set, tuple_size):
            while True:
                # randomly select [tuple_size] items from the set to create first set
                # create a key or hash from the N elements and store in a set
                # store selected N-tuple in a container
                if end_condition_met:
                    break

    I don't think this is the most efficient way of doing this - and though I am no algorithm theorist, I suspect that the running time of this algorithm is NOT O(n); in fact, it's probably more likely to be O(n!). I am wondering if there is a more efficient way of implementing such an algorithm, preferably reducing the time to O(n).

    Actually, as Mark Byers pointed out, there is a second variable m, the number of elements being selected. This (i.e. m) will typically be between 2 and 5.

    Regarding examples, here would be a typical (albeit shortened) example:

        original_list = ['CAGG', 'CTTC', 'ACCT', 'TGCA', 'CCTG', 'CAAA', 'TGCC', 'ACTT', 'TAAT',
                         'CTTG', 'CGGC', 'GGCC', 'TCCT', 'ATCC', 'ACAG', 'TGAA', 'TTTG', 'ACAA',
                         'TGTC', 'TGGA', 'CTGC', 'GCTC', 'AGGA', 'TGCT', 'GCGC', 'GCGG', 'AAAG',
                         'GCTG', 'GCCG', 'ACCA', 'CTCC', 'CACG', 'CATA', 'GGGA', 'CGAG', 'CCCC',
                         'GGTG', 'AAGT', 'CCAC', 'AACA', 'AATA', 'CGAC', 'GGAA', 'TACC', 'AGTT',
                         'GTGG', 'CGCA', 'GGGG', 'GAGA', 'AGCC', 'ACCG', 'CCAT', 'AGAC', 'GGGT',
                         'CAGC', 'GATG', 'TTCG']

    Selecting 3-tuples from the original list should produce a list (or set) similar to:

        [('CAGG', 'CTTC', 'ACCT'),
         ('CAGG', 'TGCA', 'CCTG'),
         ('CAGG', 'CAAA', 'TGCC'),
         ('CAGG', 'ACTT', 'ACCT'),
         ('CAGG', 'CTTG', 'CGGC'),
         ....
         ('CTTC', 'TGCA', 'CAAA')]

    [[Edit]] Actually, in constructing the example output, I have realized that the earlier definition I gave for UNIQUENESS was incorrect. I have updated my definition and have introduced a new metric of DISSIMILARITY instead, as a result of this finding.
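    A sketch of one way to make the pseudocode above concrete, under the pair-based DISSIMILARITY definition: keep a hash set of every unordered pair already consumed, and reject any candidate tuple that reuses a pair. Each accepted tuple costs O(m^2) set operations (m between 2 and 5, so effectively constant), giving roughly linear expected time in the number of tuples produced. The stopping rule (give up after a run of consecutive failures) is an assumption, since the question leaves the end condition open:

        import random
        from itertools import combinations

        def fetch_dissimilar_tuples(items, tuple_size, max_failures=1000):
            # used_pairs holds every unordered pair consumed by an accepted tuple;
            # frozenset makes (x, y) and (y, x) hash identically.
            used_pairs = set()
            result = []
            failures = 0
            while failures < max_failures:
                candidate = random.sample(items, tuple_size)
                pairs = {frozenset(p) for p in combinations(candidate, 2)}
                if pairs & used_pairs:
                    failures += 1    # candidate shares a pair with an earlier tuple
                    continue
                used_pairs |= pairs
                result.append(tuple(candidate))
                failures = 0         # reset the give-up counter on success
            return result

        # e.g. fetch_dissimilar_tuples(original_list, 3) yields tuples such as
        # ('CAGG', 'CTTC', 'ACCT'), no two of which share an unordered pair.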

    Read the article

< Previous Page | 5 6 7 8 9 10 11  | Next Page >