Search Results

Search found 19 results on 1 pages for 'garvin'.

Page 1/1 | 1

How do I get a different type of scrollbars in 12.04? [closed]

- by Joseph Garvin

Possible Duplicate: How do I disable overlay scrollbars? By default 12.04 uses overlay scroll bars that do not suit my taste, and every method I've found so far of disabling them makes them broken in a different way. When I was using 11.10 this wasn't a problem because I could still change the GTK theme. In 12.04, the Appearance settings only contain a few stock themes, and other than the special purpose contrast ones they all have the overlay scroll bars. If I aptitude search gtk3 | grep theme I get no results so there appears to be no packaged alternative either. Most suggestions I've seen for disabling the overlay scroll bars involve uninstalling packages or editing files as root. I want to disable them just for the current user, not for everyone on the whole box; as should be the case for any theme/display setting. There is a gsettings command that temporarily disables the overlay scrollbars just for the current user, but this has two problems of its own: The setting doesn't stick after log off. Because who would want to save settings? The scroll bars put in place have no contrast. They have a black scroller on a black background and are completely unusable. In short what I'd like to know is how to disable overlay scroll bars such that: My preference is user specific. My preference is actually saved. The scroller can actually be seen against the background without having to use a special high contrast theme that makes my whole desktop look like a negative photo from Tron.

Read the article
When running a shell script, how can you protect it from overwriting or truncating files?

- by Joseph Garvin

If while an application is running one of the shared libraries it uses is written to or truncated, then the application will crash. Moving the file or removing it wholesale with 'rm' will not cause a crash, because the OS (Solaris in this case but I assume this is true on Linux and other *nix as well) is smart enough to not delete the inode associated with the file while any process has it open. I have a shell script that performs installation of shared libraries. Sometimes, it may be used to reinstall versions of shared libraries that were already installed, without an uninstall first. Because applications may be using the already installed shared libraries, it's important the the script is smart enough to rm the files or move them out of the way (e.g. to a 'deleted' folder that cron could empty at a time when we know no applications will be running) before installing the new ones so that they're not overwritten or truncated. Unfortunately, recently an application crashed just after an install. Coincidence? It's difficult to tell. The real solution here is to switch over to a more robust installation method than an old gigantic shell script, but it'd be nice to have some extra protection until the switch is made. Is there any way to wrap a shell script to protect it from overwriting or truncating files (and ideally failing loudly), but still allowing them to be moved or rm'd? Standard UNIX file permissions won't do the trick because you can't distinguish moving/removing from overwriting/truncating. Aliases could work but I'm not sure what entirety of commands need to be aliased. I imagine something like truss/strace except before each action it checks against a filter whether to actually do it. I don't need a perfect solution that would work even against an intentionally malicious script. Ideas I have so far: Alias cp to GNU cp (not the default since I'm on Solaris) and use the --remove-destination option. Alias install to GNU install and use the --backup option. It might be smart enough to move the existing file to the backup file name rather than making a copy, thus preserving the inode. "set noclobber" in ~/.bashrc so that I/O redirection won't overwrite files

Read the article
Is it possible to share a C struct in shared memory between apps compiled with different compilers?

- by Joseph Garvin

I realize that in general the C and C++ standards gives compiler writers a lot of latitude. But in particular it guarantees that POD types like C struct members have to be laid out in memory the same order that they're listed in the structs definition, and most compilers provide extensions letting you fix the alignment of members. So if you had a header that defined a struct and manually specified the alignment of its members, then compiled two apps with different compilers using the header, shouldn't one app be able to write an instance of the struct into shared memory and the other app be able to read it without errors? I am assuming though that the size of the types contained is consistent across two compilers on the same architecture (it has to be the same platform already since we're talking about shared memory). I realize that this is not always true for some types (e.g. long vs. long long in GCC and MSVC 64-bit) but nowadays there are uint16_t, uint32_t, etc. types, and float and double are specified by IEEE standards.

Read the article
How to mmap the stack for the clone() system call on linux?

- by Joseph Garvin

The clone() system call on Linux takes a parameter pointing to the stack for the new created thread to use. The obvious way to do this is to simply malloc some space and pass that, but then you have to be sure you've malloc'd as much stack space as that thread will ever use (hard to predict). I remembered that when using pthreads I didn't have to do this, so I was curious what it did instead. I came across this site which explains, "The best solution, used by the Linux pthreads implementation, is to use mmap to allocate memory, with flags specifying a region of memory which is allocated as it is used. This way, memory is allocated for the stack as it is needed, and a segmentation violation will occur if the system is unable to allocate additional memory." The only context I've ever heard mmap used in is for mapping files into memory, and indeed reading the mmap man page it takes a file descriptor. How can this be used for allocating a stack of dynamic length to give to clone()? Is that site just crazy? ;) In either case, doesn't the kernel need to know how to find a free bunch of memory for a new stack anyway, since that's something it has to do all the time as the user launches new processes? Why does a stack pointer even need to be specified in the first place if the kernel can already figure this out?

Read the article
How do you implement Software Transactional Memory?

- by Joseph Garvin

In terms of actual low level atomic instructions and memory fences (I assume they're used), how do you implement STM? The part that's mysterious to me is that given some arbitrary chunk of code, you need a way to go back afterward and determine if the values used in each step were valid. How do you do that, and how do you do it efficiently? This would also seem to suggest that just like any other 'locking' solution you want to keep your critical sections as small as possible (to decrease the probability of a conflict), am I right? Also, can STM simply detect "another thread entered this area while the computation was executing, therefore the computation is invalid" or can it actually detect whether clobbered values were used (and thus by luck sometimes two threads may execute the same critical section simultaneously without need for rollback)?

Read the article
Why does one loop take longer to detect a shared memory update than another loop?

- by Joseph Garvin

I've written a 'server' program that writes to shared memory, and a client program that reads from the memory. The server has different 'channels' that it can be writing to, which are just different linked lists that it's appending items too. The client is interested in some of the linked lists, and wants to read every node that's added to those lists as it comes in, with the minimum latency possible. I have 2 approaches for the client: For each linked list, the client keeps a 'bookmark' pointer to keep its place within the linked list. It round robins the linked lists, iterating through all of them over and over (it loops forever), moving each bookmark one node forward each time if it can. Whether it can is determined by the value of a 'next' member of the node. If it's non-null, then jumping to the next node is safe (the server switches it from null to non-null atomically). This approach works OK, but if there are a lot of lists to iterate over, and only a few of them are receiving updates, the latency gets bad. The server gives each list a unique ID. Each time the server appends an item to a list, it also appends the ID number of the list to a master 'update list'. The client only keeps one bookmark, a bookmark into the update list. It endlessly checks if the bookmark's next pointer is non-null ( while(node->next_ == NULL) {} ), if so moves ahead, reads the ID given, and then processes the new node on the linked list that has that ID. This, in theory, should handle large numbers of lists much better, because the client doesn't have to iterate over all of them each time. When I benchmarked the latency of both approaches (using gettimeofday), to my surprise #2 was terrible. The first approach, for a small number of linked lists, would often be under 20us of latency. The second approach would have small spats of low latencies but often be between 4,000-7,000us! Through inserting gettimeofday's here and there, I've determined that all of the added latency in approach #2 is spent in the loop repeatedly checking if the next pointer is non-null. This is puzzling to me; it's as if the change in one process is taking longer to 'publish' to the second process with the second approach. I assume there's some sort of cache interaction going on I don't understand. What's going on?

Read the article
Can you force a crash if a write occurs to a given memory location with finer than page granularity?

- by Joseph Garvin

I'm writing a program that for performance reasons uses shared memory (alternatives have been evaluated, and they are not fast enough for my task, so suggestions to not use it will be downvoted). In the shared memory region I am writing many structs of a fixed size. There is one program responsible for writing the structs into shared memory, and many clients that read from it. However, there is one member of each struct that clients need to write to (a reference count, which they will update atomically). All of the other members should be read only to the clients. Because clients need to change that one member, they can't map the shared memory region as read only. But they shouldn't be tinkering with the other members either, and since these programs are written in C++, memory corruption is possible. Ideally, it should be as difficult as possible for one client to crash another. I'm only worried about buggy clients, not malicious ones, so imperfect solutions are allowed. I can try to stop clients from overwriting by declaring the members in the header they use as const, but that won't prevent memory corruption (buffer overflows, bad casts, etc.) from overwriting. I can insert canaries, but then I have to constantly pay the cost of checking them. Instead of storing the reference count member directly, I could store a pointer to the actual data in a separate mapped write only page, while keeping the structs in read only mapped pages. This will work, the OS will force my application to crash if I try to write to the pointed to data, but indirect storage can be undesirable when trying to write lock free algorithms, because needing to follow another level of indirection can change whether something can be done atomically. Is there any way to mark smaller areas of memory such that writing them will cause your app to blow up? Some platforms have hardware watchpoints, and maybe I could activate one of those with inline assembly, but I'd be limited to only 4 at a time on 32-bit x86 and each one could only cover part of the struct because they're limited to 4 bytes. It'd also make my program painful to debug ;)

Read the article
How can I use functools.partial on multiple methods on an object, and freeze parameters out of order

- by Joseph Garvin

I find functools.partial to be extremely useful, but I would like to be able to freeze arguments out of order (the argument you want to freeze is not always the first one) and I'd like to be able to apply it to several methods on a class at once, to make a proxy object that has the same methods as the underlying object except with some of its methods parameter being frozen (think of it as generalizing partial to apply to classes). I've managed to scrap together a version of functools.partial called 'bind' that lets me specify parameters out of order by passing them by keyword argument. That part works: >>> def foo(x, y): ... print x, y ... >>> bar = bind(foo, y=3) >>> bar(2) 2 3 But my proxy class does not work, and I'm not sure why: >>> class Foo(object): ... def bar(self, x, y): ... print x, y ... >>> a = Foo() >>> b = PureProxy(a, bar=bind(Foo.bar, y=3)) >>> b.bar(2) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: bar() takes exactly 3 arguments (2 given) I'm probably doing this all sorts of wrong because I'm just going by what I've pieced together from random documentation, blogs, and running dir() on all the pieces. Suggestions both on how to make this work and better ways to implement it would be appreciated ;) One detail I'm unsure about is how this should all interact with descriptors. Code follows. from types import MethodType class PureProxy(object): def __init__(self, underlying, **substitutions): self.underlying = underlying for name in substitutions: subst_attr = substitutions[name] if hasattr(subst_attr, "underlying"): setattr(self, name, MethodType(subst_attr, self, PureProxy)) def __getattribute__(self, name): return getattr(object.__getattribute__(self, "underlying"), name) def bind(f, *args, **kwargs): """ Lets you freeze arguments of a function be certain values. Unlike functools.partial, you can freeze arguments by name, which has the bonus of letting you freeze them out of order. args will be treated just like partial, but kwargs will properly take into account if you are specifying a regular argument by name. """ argspec = inspect.getargspec(f) argdict = copy(kwargs) if hasattr(f, "im_func"): f = f.im_func args_idx = 0 for arg in argspec.args: if args_idx >= len(args): break argdict[arg] = args[args_idx] args_idx += 1 num_plugged = args_idx def new_func(*inner_args, **inner_kwargs): args_idx = 0 for arg in argspec.args[num_plugged:]: if arg in argdict: continue if args_idx >= len(inner_args): # We can't raise an error here because some remaining arguments # may have been passed in by keyword. break argdict[arg] = inner_args[args_idx] args_idx += 1 f(**dict(argdict, **inner_kwargs)) new_func.underlying = f return new_func

Read the article
Why does std::cout convert volatile pointers to bool?

- by Joseph Garvin

If you try to cout a volatile pointer, even a volatile char pointer where you would normally expect cout to print the string, you will instead simply get '1' (assuming the pointer is not null I think). I assume output stream operator<< is template specialized for volatile pointers, but my question is, why? What use case motivates this behavior? Example code: #include <iostream> #include <cstring> int main() { char x[500]; std::strcpy(x, "Hello world"); int y; int *z = &y; std::cout << x << std::endl; std::cout << (char volatile*)x << std::endl; std::cout << z << std::endl; std::cout << (int volatile*)z << std::endl; return 0; } Output: Hello world 1 0x8046b6c 1

Read the article
Of these 3 methods for reading linked lists from shared memory, why is the 3rd fastest?

- by Joseph Garvin

I have a 'server' program that updates many linked lists in shared memory in response to external events. I want client programs to notice an update on any of the lists as quickly as possible (lowest latency). The server marks a linked list's node's state_ as FILLED once its data is filled in and its next pointer has been set to a valid location. Until then, its state_ is NOT_FILLED_YET. I am using memory barriers to make sure that clients don't see the state_ as FILLED before the data within is actually ready (and it seems to work, I never see corrupt data). Also, state_ is volatile to be sure the compiler doesn't lift the client's checking of it out of loops. Keeping the server code exactly the same, I've come up with 3 different methods for the client to scan the linked lists for changes. The question is: Why is the 3rd method fastest? Method 1: Round robin over all the linked lists (called 'channels') continuously, looking to see if any nodes have changed to 'FILLED': void method_one() { std::vector<Data*> channel_cursors; for(ChannelList::iterator i = channel_list.begin(); i != channel_list.end(); ++i) { Data* current_item = static_cast<Data*>(i->get(segment)->tail_.get(segment)); channel_cursors.push_back(current_item); } while(true) { for(std::size_t i = 0; i < channel_list.size(); ++i) { Data* current_item = channel_cursors[i]; ACQUIRE_MEMORY_BARRIER; if(current_item->state_ == NOT_FILLED_YET) { continue; } log_latency(current_item->tv_sec_, current_item->tv_usec_); channel_cursors[i] = static_cast<Data*>(current_item->next_.get(segment)); } } } Method 1 gave very low latency when then number of channels was small. But when the number of channels grew (250K+) it became very slow because of looping over all the channels. So I tried... Method 2: Give each linked list an ID. Keep a separate 'update list' to the side. Every time one of the linked lists is updated, push its ID on to the update list. Now we just need to monitor the single update list, and check the IDs we get from it. void method_two() { std::vector<Data*> channel_cursors; for(ChannelList::iterator i = channel_list.begin(); i != channel_list.end(); ++i) { Data* current_item = static_cast<Data*>(i->get(segment)->tail_.get(segment)); channel_cursors.push_back(current_item); } UpdateID* update_cursor = static_cast<UpdateID*>(update_channel.tail_.get(segment)); while(true) { if(update_cursor->state_ == NOT_FILLED_YET) { continue; } ::uint32_t update_id = update_cursor->list_id_; Data* current_item = channel_cursors[update_id]; if(current_item->state_ == NOT_FILLED_YET) { std::cerr << "This should never print." << std::endl; // it doesn't continue; } log_latency(current_item->tv_sec_, current_item->tv_usec_); channel_cursors[update_id] = static_cast<Data*>(current_item->next_.get(segment)); update_cursor = static_cast<UpdateID*>(update_cursor->next_.get(segment)); } } Method 2 gave TERRIBLE latency. Whereas Method 1 might give under 10us latency, Method 2 would inexplicably often given 8ms latency! Using gettimeofday it appears that the change in update_cursor-state_ was very slow to propogate from the server's view to the client's (I'm on a multicore box, so I assume the delay is due to cache). So I tried a hybrid approach... Method 3: Keep the update list. But loop over all the channels continuously, and within each iteration check if the update list has updated. If it has, go with the number pushed onto it. If it hasn't, check the channel we've currently iterated to. void method_three() { std::vector<Data*> channel_cursors; for(ChannelList::iterator i = channel_list.begin(); i != channel_list.end(); ++i) { Data* current_item = static_cast<Data*>(i->get(segment)->tail_.get(segment)); channel_cursors.push_back(current_item); } UpdateID* update_cursor = static_cast<UpdateID*>(update_channel.tail_.get(segment)); while(true) { for(std::size_t i = 0; i < channel_list.size(); ++i) { std::size_t idx = i; ACQUIRE_MEMORY_BARRIER; if(update_cursor->state_ != NOT_FILLED_YET) { //std::cerr << "Found via update" << std::endl; i--; idx = update_cursor->list_id_; update_cursor = static_cast<UpdateID*>(update_cursor->next_.get(segment)); } Data* current_item = channel_cursors[idx]; ACQUIRE_MEMORY_BARRIER; if(current_item->state_ == NOT_FILLED_YET) { continue; } found_an_update = true; log_latency(current_item->tv_sec_, current_item->tv_usec_); channel_cursors[idx] = static_cast<Data*>(current_item->next_.get(segment)); } } } The latency of this method was as good as Method 1, but scaled to large numbers of channels. The problem is, I have no clue why. Just to throw a wrench in things: if I uncomment the 'found via update' part, it prints between EVERY LATENCY LOG MESSAGE. Which means things are only ever found on the update list! So I don't understand how this method can be faster than method 2. The full, compilable code (requires GCC and boost-1.41) that generates random strings as test data is at: http://pastebin.com/e3HuL0nr

Read the article
Do condition variables still need a mutex if you're changing the checked value atomically?

- by Joseph Garvin

Here is the typical way to use a condition variable: // The reader(s) lock(some_mutex); if(protected_by_mutex_var != desired_value) some_condition.wait(some_mutex); unlock(some_mutex); // The writer lock(some_mutex); protected_by_mutex_var = desired_value; unlock(some_mutex); some_condition.notify_all(); But if protected_by_mutex_var is set atomically by say, a compare-and-swap instruction, does the mutex serve any purpose (other than that pthreads and other APIs require you to pass in a mutex)? Is it protecting state used to implement the condition? If not, is it safe then to do this?: // The writer protected_by_mutex_var = desired_value; some_condition.notify_all(); With the writer never directly interacting with the reader's mutex? If so, is it even necessary that different readers use the same mutex?

Read the article
How do you implement Software Transactional Memory?

- by Joseph Garvin

In terms of actual low level atomic instructions and memory fences (I assume they're used), how do you implement STM? The part that's mysterious to me is that given some arbitrary chunk of code, you need a way to go back afterward and determine if the values used in each step were valid. How do you do that, and how do you do it efficiently? This would also seem to suggest that just like any other 'locking' solution you want to keep your critical sections as small as possible (to decrease the probability of a conflict), am I right? Also, can STM simply detect "another thread entered this area while the computation was executing, therefore the computation is invalid" or can it actually detect whether clobbered values were used (and thus by luck sometimes two threads may execute the same critical section simultaneously without need for rollback)?

Read the article
When is a C++ terminate handler the Right Thing(TM)?

- by Joseph Garvin

The C++ standard provides the std::set_terminate function which lets you specify what function std::terminate should actually call. std::terminate should only get called in dire circumstances, and sure enough the situations the standard describes for when it's called are dire (e.g. an uncaught exception). When std::terminate does get called the situation seems analagous to being out of memory -- there's not really much you can sensically do. I've read that it can be used to make sure resources are freed -- but for the majority of resources this should be handled automatically by the OS when the process exits (e.g. file handles). Theoretically I can see a case for if say, you needed to send a server a specific message when exiting due to a crash. But the majority of the time the OS handling should be sufficient. When is using a terminate handler the Right Thing(TM)? Update: People interested in what can be done with custom terminate handlers might find this non-portable trick useful.

Read the article
Is it possible to store pointers in shared memory without using offsets?

- by Joseph Garvin

When using shared memory, each process may mmap the shared region into a different area of their address space. This means that when storing pointers within the shared region, you need to store them as offsets of the start of the shared region. Unfortunately, this complicates use of atomic instructions (e.g. if you're trying to write a lock free algorithm). For example, say you have a bunch of reference counted nodes in shared memory, created by a single writer. The writer periodically atomically updates a pointer 'p' to point to a valid node with positive reference count. Readers want to atomically write to 'p' because it points to the beginning of a node (a struct) whose first element is a reference count. Since p always points to a valid node, incrementing the ref count is safe, and makes it safe to dereference 'p' and access other members. However, this all only works when everything is in the same address space. If the nodes and the 'p' pointer are stored in shared memory, then clients suffer a race condition: x = read p y = x + offset Increment refcount at y During step 2, p may change and x may no longer point to a valid node. The only workaround I can think of is somehow forcing all processes to agree on where to map the shared memory, so that real pointers rather than offsets can be stored in the mmap'd region. Is there any way to do that? I see MAP_FIXED in the mmap documentation, but I don't know how I could pick an address that would be safe.

Read the article
Wait on multiple condition variables on Linux without unnecessary sleeps?

- by Joseph Garvin

I'm writing a latency sensitive app that in effect wants to wait on multiple condition variables at once. I've read before of several ways to get this functionality on Linux (apparently this is builtin on Windows), but none of them seem suitable for my app. The methods I know of are: Have one thread wait on each of the condition variables you want to wait on, which when woken will signal a single condition variable which you wait on instead. Cycling through multiple condition variables with a timed wait. Writing dummy bytes to files or pipes instead, and polling on those. #1 & #2 are unsuitable because they cause unnecessary sleeping. With #1, you have to wait for the dummy thread to wake up, then signal the real thread, then for the real thread to wake up, instead of the real thread just waking up to begin with -- the extra scheduler quantum spent on this actually matters for my app, and I'd prefer not to have to use a full fledged RTOS. #2 is even worse, you potentially spend N * timeout time asleep, or your timeout will be 0 in which case you never sleep (endlessly burning CPU and starving other threads is also bad). For #3, pipes are problematic because if the thread being 'signaled' is busy or even crashes (I'm in fact dealing with separate process rather than threads -- the mutexes and conditions would be stored in shared memory), then the writing thread will be stuck because the pipe's buffer will be full, as will any other clients. Files are problematic because you'd be growing it endlessly the longer the app ran. Is there a better way to do this? Curious for answers appropriate for Solaris as well.

Read the article
Is it normal for C++ static initialization to appear twice in the same backtrace?

- by Joseph Garvin

I'm trying to debug a C++ program compiled with GCC that freezes at startup. GCC mutex protects function's static local variables, and it appears that waiting to acquire such a lock is why it freezes. How this happens is rather confusing. First module A's static initialization occurs (there are __static_init functions GCC invokes that are visible in the backtrace), which calls a function Foo(), that has a static local variable. The static local variable is an object who's constructor calls through several layers of functions, then suddenly the backtrace has a few ??'s, and then it's is in the static initialization of a second module B (the __static functions occur all over again), which then calls Foo(), but since Foo() never returned the first time the mutex on the local static variable is still set, and it locks. How can one static init trigger another? My first theory was shared libraries -- that module A would be calling some function in module B that would cause module B to load, thus triggering B's static init, but that doesn't appear to be the case. Module A doesn't use module B at all. So I have a second (and horrifying) guess. Say that: Module A uses some templated function or a function in a templated class, e.g. foo<int>::bar() Module B also uses foo<int>::bar() Module A doesn't depend on module B at all At link time, the linker has two instances of foo<int>::bar(), but this is OK because template functions are marked as weak symbols... At runtime, module A calls foo<int>::bar, and the static init of module B is triggered, even though module B doesn't depend on module A! Why? Because the linker decided to go with module B's instance of foo::bar instead of module A's instance at link time. Is this particular scenario valid? Or should one module's static init never trigger static init in another module?

Read the article
Why can't I pass self as a named argument to an instance method in Python?

- by Joseph Garvin

This works: >>> def bar(x, y): ... print x, y ... >>> bar(y=3, x=1) 1 3 And this works: >>> class foo(object): ... def bar(self, x, y): ... print x, y ... >>> z = foo() >>> z.bar(y=3, x=1) 1 3 And even this works: >>> foo.bar(z, y=3, x=1) 1 3 But why doesn't this work? >>> foo.bar(self=z, y=3, x=1) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unbound method bar() must be called with foo instance as first argument (got nothing instead) This makes metaprogramming more difficult, because it requires special case handling. I'm curious if it's somehow necessary by Python's semantics or just an artifact of implementation.

Read the article
Extended maxima transform in Matlab

- by garvin

I use imtophat to apply a filter to an m x n array. I then find the local max using imextendedmax(). I get mostly 0's everywhere except for 1's in the general areas where I am expecting a local max. The weird thing is, though, that I don't get just one local max. Instead in these places I get MANY elements with 1's such as 00011100000 00111111000 00000110000 yet the values there are close but NOT equal so I would expect that there would be one that is higher than all of the rest. So I'm wondering a) if this is a bug and how I might fix it and b) how you would choose choose the element of these 1's with the highest value.

Read the article
This is a test question to thtest the new 100 characters thing, we are now past 75 [closed]

- by Fred Garvin

here are some additional info

Read the article

1