Search Results

Search found 261 results on 11 pages for 'timers'.

Page 11/11 | < Previous Page | 7 8 9 10 11 

  • Constant game speed independent of variable FPS in OpenGL with GLUT?

    - by Nazgulled
    I've been reading Koen Witters detailed article about different game loop solutions but I'm having some problems implementing the last one with GLUT, which is the recommended one. After reading a couple of articles, tutorials and code from other people on how to achieve a constant game speed, I think that what I currently have implemented (I'll post the code below) is what Koen Witters called Game Speed dependent on Variable FPS, the second on his article. First, through my searching experience, there's a couple of people that probably have the knowledge to help out on this but don't know what GLUT is and I'm going to try and explain (feel free to correct me) the relevant functions for my problem of this OpenGL toolkit. Skip this section if you know what GLUT is and how to play with it. GLUT Toolkit: GLUT is an OpenGL toolkit and helps with common tasks in OpenGL. The glutDisplayFunc(renderScene) takes a pointer to a renderScene() function callback, which will be responsible for rendering everything. The renderScene() function will only be called once after the callback registration. The glutTimerFunc(TIMER_MILLISECONDS, processAnimationTimer, 0) takes the number of milliseconds to pass before calling the callback processAnimationTimer(). The last argument is just a value to pass to the timer callback. The processAnimationTimer() will not be called each TIMER_MILLISECONDS but just once. The glutPostRedisplay() function requests GLUT to render a new frame so we need call this every time we change something in the scene. The glutIdleFunc(renderScene) could be used to register a callback to renderScene() (this does not make glutDisplayFunc() irrelevant) but this function should be avoided because the idle callback is continuously called when events are not being received, increasing the CPU load. The glutGet(GLUT_ELAPSED_TIME) function returns the number of milliseconds since glutInit was called (or first call to glutGet(GLUT_ELAPSED_TIME)). That's the timer we have with GLUT. I know there are better alternatives for high resolution timers, but let's keep with this one for now. I think this is enough information on how GLUT renders frames so people that didn't know about it could also pitch in this question to try and help if they fell like it. Current Implementation: Now, I'm not sure I have correctly implemented the second solution proposed by Koen, Game Speed dependent on Variable FPS. The relevant code for that goes like this: #define TICKS_PER_SECOND 30 #define MOVEMENT_SPEED 2.0f const int TIMER_MILLISECONDS = 1000 / TICKS_PER_SECOND; int previousTime; int currentTime; int elapsedTime; void renderScene(void) { (...) // Setup the camera position and looking point SceneCamera.LookAt(); // Do all drawing below... (...) } void processAnimationTimer(int value) { // setups the timer to be called again glutTimerFunc(TIMER_MILLISECONDS, processAnimationTimer, 0); // Get the time when the previous frame was rendered previousTime = currentTime; // Get the current time (in milliseconds) and calculate the elapsed time currentTime = glutGet(GLUT_ELAPSED_TIME); elapsedTime = currentTime - previousTime; /* Multiply the camera direction vector by constant speed then by the elapsed time (in seconds) and then move the camera */ SceneCamera.Move(cameraDirection * MOVEMENT_SPEED * (elapsedTime / 1000.0f)); // Requests to render a new frame (this will call my renderScene() once) glutPostRedisplay(); } void main(int argc, char **argv) { glutInit(&argc, argv); (...) glutDisplayFunc(renderScene); (...) // Setup the timer to be called one first time glutTimerFunc(TIMER_MILLISECONDS, processAnimationTimer, 0); // Read the current time since glutInit was called currentTime = glutGet(GLUT_ELAPSED_TIME); glutMainLoop(); } This implementation doesn't fell right. It works in the sense that helps the game speed to be constant dependent on the FPS. So that moving from point A to point B takes the same time no matter the high/low framerate. However, I believe I'm limiting the game framerate with this approach. Each frame will only be rendered when the time callback is called, that means the framerate will be roughly around TICKS_PER_SECOND frames per second. This doesn't feel right, you shouldn't limit your powerful hardware, it's wrong. It's my understanding though, that I still need to calculate the elapsedTime. Just because I'm telling GLUT to call the timer callback every TIMER_MILLISECONDS, it doesn't mean it will always do that on time. I'm not sure how can I fix this and to be completely honest, I have no idea what is the game loop in GLUT, you know, the while( game_is_running ) loop in Koen's article. But it's my understanding that GLUT is event-driven and that game loop starts when I call glutMainLoop() (which never returns), yes? I thought I could register an idle callback with glutIdleFunc() and use that as replacement of glutTimerFunc(), only rendering when necessary (instead of all the time as usual) but when I tested this with an empty callback (like void gameLoop() {}) and it was basically doing nothing, only a black screen, the CPU spiked to 25% and remained there until I killed the game and it went back to normal. So I don't think that's the path to follow. Using glutTimerFunc() is definitely not a good approach to perform all movements/animations based on that, as I'm limiting my game to a constant FPS, not cool. Or maybe I'm using it wrong and my implementation is not right? How exactly can I have a constant game speed with variable FPS? More exactly, how do I correctly implement Koen's Constant Game Speed with Maximum FPS solution (the fourth one on his article) with GLUT? Maybe this is not possible at all with GLUT? If not, what are my alternatives? What is the best approach to this problem (constant game speed) with GLUT? I originally posted this question on Stack Overflow before being pointed out about this site. The following is a different approach I tried after creating the question in SO, so I'm posting it here too. Another Approach: I've been experimenting and here's what I was able to achieve now. Instead of calculating the elapsed time on a timed function (which limits my game's framerate) I'm now doing it in renderScene(). Whenever changes to the scene happen I call glutPostRedisplay() (ie: camera moving, some object animation, etc...) which will make a call to renderScene(). I can use the elapsed time in this function to move my camera for instance. My code has now turned into this: int previousTime; int currentTime; int elapsedTime; void renderScene(void) { (...) // Setup the camera position and looking point SceneCamera.LookAt(); // Do all drawing below... (...) } void renderScene(void) { (...) // Get the time when the previous frame was rendered previousTime = currentTime; // Get the current time (in milliseconds) and calculate the elapsed time currentTime = glutGet(GLUT_ELAPSED_TIME); elapsedTime = currentTime - previousTime; /* Multiply the camera direction vector by constant speed then by the elapsed time (in seconds) and then move the camera */ SceneCamera.Move(cameraDirection * MOVEMENT_SPEED * (elapsedTime / 1000.0f)); // Setup the camera position and looking point SceneCamera.LookAt(); // All drawing code goes inside this function drawCompleteScene(); glutSwapBuffers(); /* Redraw the frame ONLY if the user is moving the camera (similar code will be needed to redraw the frame for other events) */ if(!IsTupleEmpty(cameraDirection)) { glutPostRedisplay(); } } void main(int argc, char **argv) { glutInit(&argc, argv); (...) glutDisplayFunc(renderScene); (...) currentTime = glutGet(GLUT_ELAPSED_TIME); glutMainLoop(); } Conclusion, it's working, or so it seems. If I don't move the camera, the CPU usage is low, nothing is being rendered (for testing purposes I only have a grid extending for 4000.0f, while zFar is set to 1000.0f). When I start moving the camera the scene starts redrawing itself. If I keep pressing the move keys, the CPU usage will increase; this is normal behavior. It drops back when I stop moving. Unless I'm missing something, it seems like a good approach for now. I did find this interesting article on iDevGames and this implementation is probably affected by the problem described on that article. What's your thoughts on that? Please note that I'm just doing this for fun, I have no intentions of creating some game to distribute or something like that, not in the near future at least. If I did, I would probably go with something else besides GLUT. But since I'm using GLUT, and other than the problem described on iDevGames, do you think this latest implementation is sufficient for GLUT? The only real issue I can think of right now is that I'll need to keep calling glutPostRedisplay() every time the scene changes something and keep calling it until there's nothing new to redraw. A little complexity added to the code for a better cause, I think. What do you think?

    Read the article

  • My sound stopped working today, how can I fix it?

    - by Oli
    This seems to be a problem with pulseaudio. I was logged in over VNC on my phone and started playing a video this caused X to crash (as sometimes happens). I restarted and suddenly the sound doesn't work. I have a Intel HDA/Realtek ALC889 00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller alsamixer is detecting this just fine. PulseAudio doesn't detect this alsa device so is using auto_null as the default sink (logs below). When I properly kill PulseAudio (tell it not to auto-start) direct ALSA communication with the sound card works just fine. speaker-test, for example, works. So the hardware and ALSA layers are fine IMO. In the logs, it seems that the card might be "busy" but I really don't know how or why it would be now (and never before). Is there an ALSA lock file somewhere that it still there because of my crash? I just ran sudo fuser /dev/snd/* and saw this: oli@bert:~$ sudo fuser /dev/snd/* /dev/snd/controlC0: 1884 /dev/snd/pcmC0D0c: 1884m /dev/snd/timer: 1884 A look at the process list (ps aux | grep 1884) tells me process 1884 is arecord -c 1 -f S16_LE -r 8000 -t raw. No idea what this is or why it's running. When I try and kill arecord (as root), it just respawns and rebinds on the hardware. I'm in a very annoying situation where I don't know what is going on and don't know how to find out. I'm open to all suggestions to get this working again. Fire away. And here's what I get when I stop PA auto-loading, kill it and then start it with -vvvv. oli@bert:~$ pulseaudio -vvvvv I: main.c: setrlimit(RLIMIT_NICE, (31, 31)) failed: Operation not permitted I: main.c: setrlimit(RLIMIT_RTPRIO, (9, 9)) failed: Operation not permitted D: core-rtclock.c: Timer slack is set to 50 us. D: core-util.c: RealtimeKit worked. I: core-util.c: Successfully gained nice level -11. I: main.c: This is PulseAudio 0.9.21-63-gd3efa-dirty D: main.c: Compilation host: x86_64-pc-linux-gnu D: main.c: Compilation CFLAGS: -g -O2 -g -Wall -O3 -Wall -W -Wextra -pipe -Wno-long-long -Winline -Wvla -Wno-overlength-strings -Wunsafe-loop-optimizations -Wundef -Wformat=2 -Wlogical-op -Wsign-compare -Wformat-security -Wmissing-include-dirs -Wformat-nonliteral -Wold-style-definition -Wpointer-arith -Winit-self -Wdeclaration-after-statement -Wfloat-equal -Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls -Wmissing-declarations -Wmissing-noreturn -Wshadow -Wendif-labels -Wcast-align -Wstrict-aliasing=2 -Wwrite-strings -Wno-unused-parameter -ffast-math -Wp,-D_FORTIFY_SOURCE=2 -fno-common -fdiagnostics-show-option D: main.c: Running on host: Linux x86_64 2.6.38-rc3 #1 SMP Tue Feb 1 10:53:04 GMT 2011 D: main.c: Found 8 CPUs. I: main.c: Page size is 4096 bytes D: main.c: Compiled with Valgrind support: no D: main.c: Running in valgrind mode: no D: main.c: Running in VM: no D: main.c: Optimised build: yes D: main.c: All asserts enabled. I: main.c: Machine ID is 8310740c4729ef474fe5ecec4bbf5a6b. I: main.c: Session ID is 8310740c4729ef474fe5ecec4bbf5a6b-1297338553.571075-1050119523. I: main.c: Using runtime directory /home/oli/.pulse/8310740c4729ef474fe5ecec4bbf5a6b-runtime. I: main.c: Using state directory /home/oli/.pulse. I: main.c: Using modules directory /usr/lib/pulse-0.9.21/modules. I: main.c: Running in system mode: no I: main.c: Fresh high-resolution timers available! Enjoy ol' chap! I: cpu-x86.c: CPU flags: CMOV MMX SSE SSE2 SSE3 SSSE3 SSE4_1 SSE4_2 I: svolume_mmx.c: Initialising MMX optimized functions. I: remap_mmx.c: Initialising MMX optimized remappers. I: svolume_sse.c: Initialising SSE2 optimized functions. I: remap_sse.c: Initialising SSE2 optimized remappers. I: sconv_sse.c: Initialising SSE2 optimized conversions. D: memblock.c: Using shared memory pool with 1024 slots of size 64.0 KiB each, total size is 64.0 MiB, maximum usable slot size is 65472 D: database-tdb.c: Opened TDB database '/home/oli/.pulse/8310740c4729ef474fe5ecec4bbf5a6b-device-volumes.tdb' I: module-device-restore.c: Sucessfully opened database file '/home/oli/.pulse/8310740c4729ef474fe5ecec4bbf5a6b-device-volumes'. I: module.c: Loaded "module-device-restore" (index: #0; argument: ""). D: database-tdb.c: Opened TDB database '/home/oli/.pulse/8310740c4729ef474fe5ecec4bbf5a6b-stream-volumes.tdb' I: module-stream-restore.c: Sucessfully opened database file '/home/oli/.pulse/8310740c4729ef474fe5ecec4bbf5a6b-stream-volumes'. I: module.c: Loaded "module-stream-restore" (index: #1; argument: ""). D: database-tdb.c: Opened TDB database '/home/oli/.pulse/8310740c4729ef474fe5ecec4bbf5a6b-card-database.tdb' I: module-card-restore.c: Sucessfully opened database file '/home/oli/.pulse/8310740c4729ef474fe5ecec4bbf5a6b-card-database'. I: module.c: Loaded "module-card-restore" (index: #2; argument: ""). I: module.c: Loaded "module-augment-properties" (index: #3; argument: ""). D: cli-command.c: Checking for existance of '/usr/lib/pulse-0.9.21/modules/module-udev-detect.so': success D: module-udev-detect.c: /dev/snd/controlC0 is accessible: yes D: module-udev-detect.c: /devices/pci0000:00/0000:00:1b.0/sound/card0 is busy: yes I: module-udev-detect.c: Found 1 cards. I: module.c: Loaded "module-udev-detect" (index: #4; argument: ""). D: cli-command.c: Checking for existance of '/usr/lib/pulse-0.9.21/modules/module-bluetooth-discover.so': success D: dbus-util.c: Successfully connected to D-Bus system bus ba7c9a1f90b3d49d930bca2100000015 as :1.62 D: bluetooth-util.c: dbus: interface=org.freedesktop.DBus, path=/org/freedesktop/DBus, member=NameAcquired D: bluetooth-util.c: Bluetooth daemon is apparently not available. I: module.c: Loaded "module-bluetooth-discover" (index: #5; argument: ""). D: cli-command.c: Checking for existance of '/usr/lib/pulse-0.9.21/modules/module-esound-protocol-unix.so': success I: module.c: Loaded "module-esound-protocol-unix" (index: #6; argument: ""). I: module.c: Loaded "module-native-protocol-unix" (index: #7; argument: ""). D: cli-command.c: Checking for existance of '/usr/lib/pulse-0.9.21/modules/module-gconf.so': success I: module.c: Loaded "module-gconf" (index: #8; argument: ""). I: module-default-device-restore.c: Saved default sink 'auto_null' not existant, not restoring default sink setting. I: module-default-device-restore.c: Saved default source 'auto_null.monitor' not existant, not restoring default source setting. I: module.c: Loaded "module-default-device-restore" (index: #9; argument: ""). I: module.c: Loaded "module-rescue-streams" (index: #10; argument: ""). D: module-always-sink.c: Autoloading null-sink as no other sinks detected. I: sink.c: Created sink 0 "auto_null" with sample spec s16le 6ch 44100Hz and channel map front-left,front-left-of-center,front-center,front-right,front-right-of-center,rear-center I: sink.c: device.description = "Dummy Output" I: sink.c: device.class = "abstract" I: sink.c: device.icon_name = "audio-card" D: core-subscribe.c: Dropped redundant event due to change event. I: source.c: Created source 0 "auto_null.monitor" with sample spec s16le 6ch 44100Hz and channel map front-left,front-left-of-center,front-center,front-right,front-right-of-center,rear-center I: source.c: device.description = "Monitor of Dummy Output" I: source.c: device.class = "monitor" I: source.c: device.icon_name = "audio-input-microphone" D: module-null-sink.c: Thread starting up I: module.c: Loaded "module-null-sink" (index: #11; argument: "sink_name=auto_null sink_properties='device.description="Dummy Output"'"). I: module.c: Loaded "module-always-sink" (index: #12; argument: ""). I: module.c: Loaded "module-intended-roles" (index: #13; argument: ""). D: module-suspend-on-idle.c: Sink auto_null becomes idle, timeout in 5 seconds. I: module.c: Loaded "module-suspend-on-idle" (index: #14; argument: ""). I: client.c: Created 0 "ConsoleKit Session /org/freedesktop/ConsoleKit/Session1" D: module-console-kit.c: Added new session /org/freedesktop/ConsoleKit/Session1 I: module.c: Loaded "module-console-kit" (index: #15; argument: ""). I: module.c: Loaded "module-position-event-sounds" (index: #16; argument: ""). D: dbus-util.c: Successfully connected to D-Bus session bus efbffc6788fad56cfd64d40c00000018 as :1.182 D: main.c: Got org.pulseaudio.Server! I: main.c: Daemon startup complete. I: client.c: Created 1 "Native client (UNIX socket client)" I: client.c: Created 2 "Native client (UNIX socket client)" D: protocol-native.c: Protocol version: remote 16, local 16 I: protocol-native.c: Got credentials: uid=1000 gid=1000 success=1 D: protocol-native.c: SHM possible: yes D: protocol-native.c: Negotiated SHM: yes D: protocol-native.c: Protocol version: remote 16, local 16 I: protocol-native.c: Got credentials: uid=1000 gid=1000 success=1 D: protocol-native.c: SHM possible: yes D: protocol-native.c: Negotiated SHM: yes D: module-augment-properties.c: Looking for .desktop file for gnome-volume-control-applet D: module-augment-properties.c: Looking for .desktop file for gnome-settings-daemon D: core-subscribe.c: Dropped redundant event due to change event. I: module-suspend-on-idle.c: Sink auto_null idle for too long, suspending ... D: sink.c: Suspend cause of sink auto_null is 0x0004, suspending Note the one section that seems to find the hardware but says it's busy (no idea if this is relevant). D: cli-command.c: Checking for existance of '/usr/lib/pulse-0.9.21/modules/module-udev-detect.so': success D: module-udev-detect.c: /dev/snd/controlC0 is accessible: yes D: module-udev-detect.c: /devices/pci0000:00/0000:00:1b.0/sound/card0 is busy: yes I: module-udev-detect.c: Found 1 cards.

    Read the article

  • CodePlex Daily Summary for Sunday, May 18, 2014

    CodePlex Daily Summary for Sunday, May 18, 2014Popular ReleasesClosedXML - The easy way to OpenXML: ClosedXML 0.70.0: A lot of fixes. See history.TBox - tool to make developer's life easier.: TBox 1.29: Bug fixing. Add LocalizationTool pluginYAXLib: Yet Another XML Serialization Library for the .NET Framework: YAXLib 2.13: Fixed a bug and added unit tests related to serializing path like aliases with one letter (e.g., './B'). Thanks go to CodeProject user B.O.B. for reporting this bug. Added `Bin/*.dll.mdb` to `.gitignore`. Fixed the issue with Indexer properties. Indexers must not be serialized/deserialized. YAXLib will ignore delegate (callback/function pointer) properties, so that the exception upon serializing them is prevented. Significant improve in cycling object reference detection Self Referr...SFDL.NET: SFDL.NET (2.2.9.2): Changelog: Neues Icon Xup.in CnL Plugin BugfixSEToolbox: SEToolbox 01.030.008 Release 1: Fixed cube editor failing to apply color to cubes. Added to cube editor, replace cube dialog, and Build Percent dialog. Corrected for hidden asteroid ore, allowing rare ore to show when importing an asteroid, or converting a 3d model to an asteroid (still appears to be limitations on rare ore in small asteroids). Allowed ore selection to Asteroid file import. (Can copy/import and convert existing asteroid to another ore). Added progress bars to common long running operations. Fixed ...Better Robocopy GUI: Command Line GUI for Robocopy: Better Robocopy GUI had become the primary plugin in Command Line GUI built on .NET 4Mini SQL Query: Mini SQL Query (1.0.71.456): Minor fixes and template corrections.Visual Studio Settings Switcher: Settings Switcher 1.1: Settings Switcher is compatible with Visual Studio 2012 and Visual Studio 2013. Express editions are not supported. NewFull support for Visual Studio 2013. Solution Settings Files (see Documentation for details.) Bug fixes and general usability improvements. There are two ways to install Settings Switcher: DOWNLOAD FROM CODEPLEXDownload the installer (.vsix) from the link above. Close all instances of Visual Studio. Double-click the .vsix file to install Settings Switcher. DOWNLOA...SharePoint Online Automation Cmdlets: Apps, Solutions and Permissions: Solutions can now be activated/deactivated/updated :-) See documentation for examples. Added Add-SPOApp and Install-SPOApp for uploading and activating apps on non-developer site collections. Also adding groups and permission levels has been included. Install Instructions Install the SharePoint Online Client components. Download and run the MSI file from the downloads section.TFS Planning and Disaster Recovery Avoidance Guide: v1.4.BETA - TFS, DR and Azure IaaS Planning Guides: Welcome to the TFS Planning and DR Avoidance Guidance What is new? A new crisper, more compact style, which is easier to consume on multiple devices without sacrificing any content. Also included are the new TFS on Azure IaaS guide and supplementary guides. Note Capacity planning workbook and posters are included in the Everything Zip package. Quality-Bar Detail Documentation has been reviewed by Visual Studio ALM Rangers Documentation has been through an independent technical review ...CRM Web API Lead Capture Example: CRM Web API Lead Capture Example: Sample Visual Studio 2013 project that provides an example of how you could use Web API and Microsoft Azure (could be deployed anywhere) to capture simple HTML form data into Dynamics CRM without directly having to integrate a .NET component.MB Tools: MDT Monitor Tool v1.4: This tool is used to connect to an MDT 2013 Monitor Webservice. The purpose is to provide an alternative to the MDT Deployment Workbench. Update: New in v1.4: Fixed bug where Dart Remote Viewer didnt work Option to show client local time instead of UTC, edit config.xml to enable/disable New in v1.2: Fixed Dart Remote Viewer not connection to full ip Issue: 1222 New in v1.1: Added timers for autorefresh of webservice info Added some better errorchecking and cleaned up the code a bit ...WinAudit: WinAudit Freeware v3.0: WinAudit.exe v3.0 MD5: 88750CCF49FF7418199B2645755830FA Known Issues: 1. Report creation can be very slow when right-to-left (Hebrew) characters are present. 2. Emsisoft Anti-Malware may stop and/or quarantine WinAudit. This happens when WinAudit attempts to obtain a list if running programmes. You will need to set an exception rule in Emsisoft to allow WinAudit to run.TerraMap (Terraria World Map Viewer): TerraMap 1.0.4: Added support for the new Terraria v1.2.4 update. New items, walls, and tiles Fixed Issue 35206: Hightlight/Find doesn't work for Demon Altars Fixed finding Demon Hearts/Shadow Orbs Added ability to find Enchanted Swords (in the stone) and Water Bolt books Fixed installer not uninstalling older versions The setup file will make sure .NET 4 is installed, install TerraMap, create desktop and start menu shortcuts, add a .wld file association, and launch TerraMap. If you prefer the zip ...Amqp.Net Lite: 0.1: This is the Alpha-quality release of the AMQP.Net Lite library.TSS.MSR: TSS.MSR v1.1: MSR's TPM2.0 access libraries and sample applications.WPF Localization Extension: v2.2.1: Issue #9277 Issue #9292 Issue #9311 Issue #9312 Issue #9313 Issue #9314Hime Parser Generator: Hime Parser Generator v1.0.0: This releases the stable version of the Hime parser generator. This release contains many bugfixes and performance enhancement. It also provides a clean API for the manipulation and debugging of context-free grammars.CtrlAltStudio Viewer: CtrlAltStudio Viewer 1.2.1.41167 Release: This release of the CtrlAltStudio Viewer includes the following significant features: Oculus Rift support. Stereoscopic 3D display support. Variable walking / flying speed. Xbox 360 Controller support. Kinect for Windows support. Based on Firestorm viewer 4.6.5 codebase. For more details, see the release notes linked to below. Release notes: http://ctrlaltstudio.com/viewer/release-notes/1-2-1-41167-release Support info: http://ctrlaltstudio.com/viewer/support Privacy policy: http:/...ExtJS based ASP.NET Controls: FineUI v4.0.6: FineUI(???) ?? ExtJS ??? ASP.NET ??? FineUI??? ?? No JavaScript,No CSS,No UpdatePanel,No ViewState,No WebServices ??????? ?????? IE 8.0+、Chrome、Firefox、Opera、Safari ???? Apache License v2.0 ?:ExtJS ?? GPL v3 ?????(http://www.sencha.com/license) ???? ??:http://fineui.com/ ??:http://fineui.com/bbs/ ??:http://fineui.com/demo/ ??:http://fineui.com/doc/ ??:http://fineui.codeplex.com/ FineUI ???? ExtJS ????????,???? ExtJS ?,?????: 1. ????? FineUI ? ExtJS ? http://fineui.com/bbs/forum.ph...New ProjectsCheburashka: Static Code Analysis Rule-set for Visual Studio SSDT projectsFree Workflow: Free Workflow Project aim to use Microsoft Workflow Manager as hosting environment to Business Process Workflow , it has its own database to manage users tasks.Jenkins Tray: Jenkins tray real time notifyMoneyManagement: ?? WCF ??????????????WCF?????。??????????????????????????????。MoonSharp: An interpreter for a very close cousin of the Lua language, written in C# for the .NET, Mono, Xamarin and Unity3D platforms.OITPMS_MVC: Online Issue Tracking and Project Management System This project was created for Bug tracking and issues that comes while creating project or on going projectSpiral Chrome: Spiral Chrome, the quick, simple, and user friendly method of phishing. Throttling Suite for Web API: The Throttling Suite provides throttling control capabilities to the .NET Web API applications. It is highly customizable product, yet simple to use.TP2 .NET: Trabajo practico 2 para la clase de .net??????-??????【??】??????????: ?????????????????,???????????????。???????????,??????:????、????、???????! ????-????【??】????????: ????????????????,???????????,??????????????,??????????,??????????????!?????-?????【??】?????????: ??????????????????,?????????????,????,?????????,?????????????,?????,?????! ??????-??????【??】??????????: ??????????????????、????,??100%????,??????,????????????,???????????! ????-????【??】????????: ?????????????????????:????、????、??????????????,????????。????????! ?????-?????【??】?????????: ???????????????????,??????????,????????、????,??????????,??????????。 ?????-?????【??】?????????: ???????,??????:?????,?????,??????,??????????,????????。????????! ??????-??????【??】??????????: ?????????????????,??????????、??????,??????????、????、????、???????。 ??????-??????【??】??????????: ?????????????、??????????????????,????????,?????,??????,????,????,????! ????-????【??】????????: ???????????????????????????、????、????、???????????,????,????! ?????-?????【??】?????????: ?????????????????????,???????????????,????????????????????! ?????-?????【??】?????????: ?????????????????????,???????????????,???????,?????,?????,????? !!! ??????-??????【??】??????????: ????????????????????,????????:??、??、???,?????????????????????! ??????-??????【??】??????????: ??????????????????????,????“???????,???????”?????,????????????! ??????-??????【??】??????????: ???????????,????,????,??????,????“????、????、????、????”????????,??????. ??????-??????【??】??????????: ?????????????,?????,???????????,???????,????,????,????,?????。 ?????-?????【??】?????????: ????????????,?????,???????????,???????,????,????,????,?????。 ??????-??????【??】??????????: ?????????????????????????????,??????????,????,????,?????????、??????,??????。 ????-????【??】????????: ????????????????????????,????,????,??????????。???????????????,??,??,??????????,??????... ?????-?????【??】?????????: ???????????????,?????????????? ??。????????、????、????、?????????? ???????。 ??????-??????【??】??????????: ????????????、???、??、??????????????????????????????,????????????????! ????-????【??】????????: ??????????????6?,???????????????????????????,??????????????,?????????! ?????-?????【??】?????????: ???????????????8?,????????,????????,??????????,?????,????? ,????????!

    Read the article

  • Prepping a conference

    - by Laurent Bugnion
    I have had the chance to talk at many conferences these past few years, and came up with a way to prepare them which works really well for me. Most importantly, it would make it quite easy to overcome an emergency (for example if my laptop would suddenly lose data). The whole code as well as the slides and other documents are in the cloud. I also use source control for my demos, so that I always have the latest and the greatest, but also a history of changes I made to my demos. Finally I have a system of code snippets which works great, and I often had very positive remarks from the audience regarding that. Putting everything in the cloud The one thing I used to be the most scared of was a sudden crash of my laptop, and being unable to restore in time for a conference. Most conferences ask speakers to send slides a few days (or weeks…) in advance, but let's face it, we all have last minute changes to our talks and I always come in the conference with updated slides that I pass to the management team. The answer to that dilemma used to be working off memory sticks, and that worked not bad. However last year I started putting all the documents relating to a conference in a DropBox folder, and that works great too. Obviously DropBox works only if you have connectivity, so if I for instance update slides while on an international flight, I cannot save to the cloud. The obvious answer to that is to backup everything on a memory stick… but I have to admit, I have been trusting my luck and working off my laptop HD and then synching everything to the cloud after landing. Of course on some US national flights you get WiFi on board, so in that case it is even simpler :) Usually after the conference is done, I remove the files from DropBox and copy them to their "final destination". They are backed up from there to BackBlaze, the great online backup service I am using routinely (I currently have about 90GB of data in BackBlaze). Outlining the presentations I like to have a written outline of my presentations written somewhere. I keep it simple, just write the various sections of the presentation with timing. I guess it is a remnant of the time when I was a private pilot, and using checklists for flight preparation. For example: Demo about designability 15' (0:37) Switch to Blend Open MainPage.xaml Create a DataTemplate ... Here I can immediately see during the presentation if I am taking too much time for my demo (0:37 is where I need to be when I am done with this section of the presentation, and 15' is the time that this particular section takes). I keep these sections reasonable, I don't detail every step of the preparation. Typically I have one such section for every 10-15 minutes of my talks. Yes, I am timing my presentations. I keep adjusting these numbers when I rehearse, and this really helps to feel more confident during the presentations. This is especially important for presentations that are long, like my MIX11 demo which clocked at 57 minutes (I had a lot of stuff to show…). Such presentations are risky, because if anything goes wrong, you will have to cut stuff, so the answer to that is: Rehearse, rehearse and when you're done rehearsing, rehearse a little more. I also have a "Preparation" section where I outline what I need to do before a presentation. For instance: Preparation Reboot in VHD Make sure MSN and Twitter are not running. Open VS10 and load demo Open Blend and load demo Run the WP7 emulator ... I typically start preparing my laptop an hour before the talk, starting everything I need to start and then putting my laptop to sleep. Saving and printing the outline, Timing Printing is a real problem because it is really hard to find a printer at most conference venues, and also quite hard in hotels. To solve that, I simply write everything in OneNote (synched to the cloud, now you start to know what I like ;) and then I print it to a PDF (I use CutePDFWriter) that I save to my Kindle. During the presentation, I read the outline off the Kindle (I mostly just need a quick check to see how I am timing). For timing during the presentation, I use the free tool ChronoGPS on my Windows Phone 7, but of course any phone these days has a clock/chrono application. In some conferences, they even have timers that the presenters can see, but they tend to count down and I prefer to count up… so I just use my own :) Source control for demos For demos, I create a separate folder and use Mercurial as source control. Mercurial has the huge advantage (over SVN or TFS) to work offline too, so I can commit while on a plane, and all the history is saved. Then when I have connectivity I push everything to the cloud (I am using the fantastic Trunksapp.com for my private repositories). Here too the obvious downside is the risk of losing my last changes if my laptop crashes before I can push to the cloud, and here too the obvious answer would be to work from a memory stick… though I have to admit I didn't do that lately (except when I was writing Silverlight 4 Unleashed, where I was really paranoid…) And code snippets? I am one of these presenters who hates to type in front of an audience. I can type really fast (writing two books has this advantage, it really teaches you to touch type and be fast at it) but in the context of an audience, on a stage where it is often damn cold (an issue I had a lot in past conferences, air conditioning can freeze your fingers and make it really hard to type), it doesn't work as well. I don't know for you, but I really dislike seeing a presentation where the speaker uses the backspace key more often than others ;) To solve that, I like to have my code ready in snippets, and drag them to the screen. Then I can spend time explaining each code snippet, while highlighting portions of the code (always highlight what you talk about, the audience often doesn't even see the cursor and doesn't know where you are on the screen!) Over the years I have used various solutions for code snippets, and now I have one which works really well… if you take a few precautions! I use the Visual Studio Toolbox. Preparing the code snippets You can store code snippets in the Toolbox for anything, XAML, C# etc. I arrange the snippets in the order in which I need them, which is a great way to remember what comes next in the presentation. I also separate them by topic, to make it easier to find them, for example when I switch to the slides and then back to the code. Remember that no matter how experienced you are, you will feel more nervous on stage than while you are preparing, so any way to make it easier for you is going to be beneficial to the audience. To store a code snippet, I do the following: Open the final demo that you want to show to the audience in Visual Studio. In your code, select a snippet of code that you want to explain in particular. Make sure that the Visual Studio Toolbox is open (menu View, Toolbox or Ctrl-Alt-X). Drag the selected snippet from the code window to the toolbox. (if needed) drag the snippet to the correct location (for example between two other code snippets so that you can access it as you speak through the demo). Right click on the snippet and select Rename Item from the context menu. Select a meaningful name. For me I use the following conventions: If it is a method, I use the method's name. If it is not a whole method, I use a descriptive name. If it is the content of a method (i.e. the body only, without the method's signature), I use "-> MethodName". This reminds me during the presentation that this is only the body, and that I need to insert that into an existing signature. This is the case, for instance, when I use Visual Studio to automatically generate the members of an interface’s implementation; then I only need to insert my snippet inside the generated method body. Saving the snippets This is the most important!! It happened to me a few times that VS10 lost its settings. When that happens, the snippets are lost too! Yeah that really sucks, especially (as it happened once) when this is the case about an hour before a talk… Stress and sweat follows, not good conditions to start a talk in front of an audience believe me. Thankfully, saving snippets is really easy with the following steps: Select the menu Tools, Import and Export Settings. Select Export selected environment settings and press Next. Uncheck All Settings. Then expand General Settings and select Toolbox (only!). Press Next. Select your source control folder and save under a meaningful name (for instance Snippets.vssettings). Commit to source control and push to the cloud. By the way, this also has the advantage of applying source control to the snippets file (which is an XML file), so you get history for free on that file! Reimporting the snippets If VS loses its settings and you need to reimport the snippets, this can be done super easily and very fast. Make sure that the Toolbox is empty. When you import snippets, they are merged with existing ones, they do not replace the content of the Toolbox. Unless merging is really what you want, make sure that your Toolbox is clean before you import, it is really easier. Select the menu Tools, Import and Export Settings. Select Import selected environment settings and press Next. Select No, just import new settings and press Next. Press Browse and select the Snippets.vssettings file. Press Finish. Et voila, all your snippets appear again in the Toolbox. Whew, the worst was averted and you can start your demo without sweating! (I had to do that once literally 5 minutes before the start of a demo, while my laptop was already hooked to the projector, and it went just fine). What about special tools? When using special tools (for example beta versions of tools you have an early access to), or a special configuration of your laptop, things can get tricky because you cannot really be sure that you will get a laptop with the same tools and the same configuration at the conference. To solve that, I use the following precautions: I make my demos from a Virtual Hard Disk. The great John Papa made a very easy-to-follow web page where he explains how to create a VHD and install Win7 to it. This gives you the full power of your laptop (as fast as booting from the metal). For me, I have a basic configuration that I saved on a USB harddrive (Win7 plus drivers, basic settings for desktop, folder options, taskbar etc) and Visual Studio 2010 SP1 on it. When preparing, I start by copying this "basis VHD" to my laptop. I install additional tools and configurations. I save the VHD back to the USB harddrive in a different folder. This would allow me to reinstall my demo environment quite fast, for example in case of harddrive failure. Replace the harddrive, copy the VHD to it, configure the BCD and you can start. Unfortunately this only works if the laptop itself still works. In the worst case of total failure, my security is to back all the installers up: The installers I use are synched on all my laptops and backed up to BackBlaze. If the worst happens and my laptop is absolutely broken, I can download the installer from BackBlaze and install on another laptop. This of course takes some time, and if that happens 5 minutes before a presentation, well… I don't have an answer to that, except of course crossing my fingers. Still, all that gives me additional security. Conclusion Remember folks, talking to an audience, large or small, will make you nervous. Just ask Scott Hanselman :) The goal here is to create the best possible conditions for you, and to create an environment where everything is saved and easy to restore, where everything is well known and provides you with additional confidence. The cooler you feel before the presentation (and during ;)), the better your presentation will be. Here too, the goal is to provide the best user experience you can have, which in turn will make it more enjoyable for your audience! Happy presenting :) Laurent   Laurent Bugnion (GalaSoft) Subscribe | Twitter | Facebook | Flickr | LinkedIn

    Read the article

  • Why is .NET faster than C++ in this case?

    - by acidzombie24
    -edit- I LOVE SLaks comment. "The amount of misinformation in these answers is staggering." :D Calm down guys. Pretty much all of you were wrong. I DID make optimizations. It turns out whatever optimizations I made wasn't good enough. I ran the code in GCC using gettimeofday (I'll paste code below) and used g++ -O2 file.cpp and got slightly faster results then C#. Maybe MS didn't create the optimizations needed in this specific case but after downloading and installing mingw I was tested and found the speed to be near identical. Justicle Seems to be right. I could have sworn I use clock on my PC and used that to count and found it was slower but problem solved. C++ speed isn't almost twice as slower in the MS compiler. When my friend informed me of this I couldn't believe it. So I took his code and put some timers onto it. Instead of Boo I used C#. I constantly got faster results in C#. Why? The .NET version was nearly half the time no matter what number I used. C++ version: #include <iostream> #include <stdio.h> #include <intrin.h> #include <windows.h> using namespace std; int fib(int n) { if (n < 2) return n; return fib(n - 1) + fib(n - 2); } int main() { __int64 time = 0xFFFFFFFF; while (1) { int n; //cin >> n; n = 41; if (n < 0) break; __int64 start = __rdtsc(); int res = fib(n); __int64 end = __rdtsc(); cout << res << endl; cout << (float)(end-start)/1000000<<endl; break; } return 0; } C# version: using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Runtime.InteropServices; using System.ComponentModel; using System.Threading; using System.IO; using System.Diagnostics; namespace fibCSTest { class Program { static int fib(int n) { if (n < 2)return n; return fib(n - 1) + fib(n - 2); } static void Main(string[] args) { //var sw = new Stopwatch(); //var timer = new PAB.HiPerfTimer(); var timer = new Stopwatch(); while (true) { int n; //cin >> n; n = 41; if (n < 0) break; timer.Start(); int res = fib(n); timer.Stop(); Console.WriteLine(res); Console.WriteLine(timer.ElapsedMilliseconds); break; } } } } GCC version: #include <iostream> #include <stdio.h> #include <sys/time.h> using namespace std; int fib(int n) { if (n < 2) return n; return fib(n - 1) + fib(n - 2); } int main() { timeval start, end; while (1) { int n; //cin >> n; n = 41; if (n < 0) break; gettimeofday(&start, 0); int res = fib(n); gettimeofday(&end, 0); int sec = end.tv_sec - start.tv_sec; int usec = end.tv_usec - start.tv_usec; cout << res << endl; cout << sec << " " << usec <<endl; break; } return 0; }

    Read the article

  • Thread Blocks During Call

    - by user578875
    I have a serious problem, I'm developing an application that mesures on call time during a call; the problem presents when, with the phone on the ear, the thread that the timer has, blocks and no longer responds before taking off my ear. The next log shows the problem. 01-11 16:14:19.607 14558 14566 I Estado : postDelayed Async Service 01-11 16:14:20.607 14558 14566 I Estado : postDelayed Async Service 01-11 16:14:21.607 14558 14566 I Estado : postDelayed Async Service 01-11 16:14:22.597 14558 14566 I Estado : postDelayed Async Service 01-11 16:14:23.608 14558 14566 I Estado : postDelayed Async Service 01-11 16:14:24.017 1106 1106 D iddd : select() < 0, Probably a handled signal: Interrupted system call 01-11 16:14:24.607 14558 14566 I Estado : postDelayed Async Service 01-11 16:18:05.500 1106 1106 D iddd : select() < 0, Probably a handled signal: Interrupted system call 01-11 16:18:06.026 14558 14566 I Estado : postDelayed Async Service 01-11 16:18:06.026 14558 14566 I Estado : postDelayed Async Service 01-11 16:18:06.026 14558 14566 I Estado : postDelayed Async Service 01-11 16:18:06.026 14558 14566 I Estado : postDelayed Async Service 01-11 16:18:06.026 14558 14566 I Estado : postDelayed Async Service 01-11 16:18:06.026 14558 14566 I Estado : postDelayed Async Service I've been trying with Services, Timers, Threads, AyncTasks and they all present the same problem. My Code: @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); requestWindowFeature(Window.FEATURE_NO_TITLE); getWindow().setFlags(WindowManager.LayoutParams.FLAG_FULLSCREEN, WindowManager.LayoutParams.FLAG_FULLSCREEN); setContentView(R.layout.main); HangUpService.setMainActivity(this); objHangUpService = new Intent(this, HangUpService.class); Runnable rAccion = new Runnable() { public void run() { TelephonyManager tm = (TelephonyManager)getSystemService(TELEPHONY_SERVICE); tm.listen(mPhoneListener, PhoneStateListener.LISTEN_CALL_STATE); objVibrator = (Vibrator) getSystemService(getApplicationContext().VIBRATOR_SERVICE); final ListView lstLlamadas = (ListView) findViewById(R.id.lstFavoritos); final EditText txtMinutos = (EditText) findViewById(R.id.txtMinutos); final EditText txtSegundos = (EditText) findViewById(R.id.txtSegundos); ArrayList<Contacto> cContactos = new ArrayList<Contacto>(); ContactoAdapter caContactos = new ContactoAdapter(HangUp.this, R.layout.row,cContactos); Cursor curContactos = getContentResolver().query( ContactsContract.Contacts.CONTENT_URI, null, null, null, ContactsContract.Contacts.TIMES_CONTACTED + " DESC"); while (curContactos.moveToNext()){ String strNombre = curContactos.getString(curContactos.getColumnIndex(ContactsContract.Contacts.DISPLAY_NAME)); String strID = curContactos.getString(curContactos.getColumnIndex(ContactsContract.Contacts._ID)); String strHasPhone=curContactos.getString(curContactos.getColumnIndex(ContactsContract.Contacts.HAS_PHONE_NUMBER)); String strStarred=curContactos.getString(curContactos.getColumnIndex(ContactsContract.Contacts.STARRED)); if (Integer.parseInt(strHasPhone) > 0 && Integer.parseInt(strStarred) ==1 ) { Cursor CursorTelefono = getContentResolver().query( ContactsContract.CommonDataKinds.Phone.CONTENT_URI, null, ContactsContract.CommonDataKinds.Phone.CONTACT_ID +" = " + strID, null, null); while (CursorTelefono.moveToNext()) { String strTipo=CursorTelefono.getString(CursorTelefono.getColumnIndex(ContactsContract.CommonDataKinds.Phone.TYPE)); String strTelefono=CursorTelefono.getString(CursorTelefono.getColumnIndex(ContactsContract.CommonDataKinds.Phone.NUMBER)); strNumero=strTelefono; String args[]=new String[1]; args[0]=strNumero; Cursor CursorCallLog = getContentResolver().query( android.provider.CallLog.Calls.CONTENT_URI, null, android.provider.CallLog.Calls.NUMBER + "=?", args, android.provider.CallLog.Calls.DATE+ " DESC"); if (Integer.parseInt(strTipo)==2) { caContactos.add( new Contacto( strNombre, strTelefono ) ); } } CursorTelefono.close(); } } curContactos.close(); lstLlamadas.setAdapter(caContactos); lstLlamadas.setOnItemClickListener(new OnItemClickListener() { @Override public void onItemClick(AdapterView a, View v, int position, long id) { Contacto mContacto=(Contacto)lstLlamadas.getItemAtPosition(position); i = new Intent(HangUp.this, Llamada.class); Log.i("Estado","Declaro Intent"); Bundle bundle = new Bundle(); bundle.putString("telefono", mContacto.getTelefono()); i.putExtras(bundle); startActivityForResult(i,SUB_ACTIVITY_ID); Log.i("Estado","Inicio Intent"); blActivo=true; try { String strMinutos=txtMinutos.getText().toString(); String strSegundos=txtSegundos.getText().toString(); if(!strMinutos.equals("") && !strSegundos.equals("")){ int Tiempo = ( (Integer.parseInt(txtMinutos.getText().toString())*60) + Integer.parseInt(txtSegundos.getText().toString()) )* 1000; handler.removeCallbacks(rVibrate); cTime = System.currentTimeMillis(); cTime=cTime+Tiempo; objHangUpAsync = new HangUpAsync(cTime,objVibrator,objPowerManager,objKeyguardLock); objHangUpAsync.execute(); objPowerManager.userActivity(Tiempo+3000, true); objHangUpService.putExtra("cTime", cTime); //startService(objHangUpService); } catch (Exception e) { e.printStackTrace(); } finally { } } }); } }; } AsyncTask: @Override protected String doInBackground(String... arg0) { blActivo = true; mWakeLock = objPowerManager.newWakeLock(PowerManager.FULL_WAKE_LOCK, "My Tag"); objKeyguardLock.disableKeyguard(); Log.i("Estado", "Entro a doInBackground"); timer.scheduleAtFixedRate( new TimerTask() { public void run() { if (blActivo){ if (cTime blActivo=false; objVibrator.vibrate(1000); Log.i("Estado","Vibrar desde Async"); this.cancel(); }else{ try{ mWakeLock.acquire(); mWakeLock.release(); Log.i("Estado","postDelayed Async Service"); }catch(Exception e){ Log.i("Estado","Error: " + e.getMessage()); } } } } }, 0, INTERVAL); return null; }

    Read the article

  • OpenGL true coordinates and glutTimerFunc() problem C++

    - by Meko
    HI I am starting to learn openGl for C++.but at stating point I stucked. I have 2 question that is the coordinates for drawing some objects? I mean where is X, Y and Z? Second one I am making tutorial from some sites. and I am trying to animate my triangle.In tutorial it works but on my computer not.I Also downloaded source codes but It doesnt move. Here sample codes. I thougt that problem is glutTimerFunc(). #include #include #ifdef APPLE #include #include #else #include #endif using namespace std; //Called when a key is pressed void handleKeypress(unsigned char key, int x, int y) { switch (key) { case 27: //Escape key exit(0); } } //Initializes 3D rendering void initRendering() { glEnable(GL_DEPTH_TEST); } //Called when the window is resized void handleResize(int w, int h) { glViewport(0, 0, w, h); glMatrixMode(GL_PROJECTION); glLoadIdentity(); gluPerspective(45.0, (double)w / (double)h, 1.0, 200.0); } float _angle = 30.0f; float _cameraAngle = 0.0f; //Draws the 3D scene void drawScene() { glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); glMatrixMode(GL_MODELVIEW); //Switch to the drawing perspective glLoadIdentity(); //Reset the drawing perspective glRotatef(-_cameraAngle, 0.0f, 1.0f, 0.0f); //Rotate the camera glTranslatef(0.0f, 0.0f, -5.0f); //Move forward 5 units glPushMatrix(); //Save the transformations performed thus far glTranslatef(0.0f, -1.0f, 0.0f); //Move to the center of the trapezoid glRotatef(_angle, 0.0f, 0.0f, 1.0f); //Rotate about the z-axis glBegin(GL_QUADS); //Trapezoid glVertex3f(-0.7f, -0.5f, 0.0f); glVertex3f(0.7f, -0.5f, 0.0f); glVertex3f(0.4f, 0.5f, 0.0f); glVertex3f(-0.4f, 0.5f, 0.0f); glEnd(); glPopMatrix(); //Undo the move to the center of the trapezoid glPushMatrix(); //Save the current state of transformations glTranslatef(1.0f, 1.0f, 0.0f); //Move to the center of the pentagon glRotatef(_angle, 0.0f, 1.0f, 0.0f); //Rotate about the y-axis glScalef(0.7f, 0.7f, 0.7f); //Scale by 0.7 in the x, y, and z directions glBegin(GL_TRIANGLES); //Pentagon glVertex3f(-0.5f, -0.5f, 0.0f); glVertex3f(0.5f, -0.5f, 0.0f); glVertex3f(-0.5f, 0.0f, 0.0f); glVertex3f(-0.5f, 0.0f, 0.0f); glVertex3f(0.5f, -0.5f, 0.0f); glVertex3f(0.5f, 0.0f, 0.0f); glVertex3f(-0.5f, 0.0f, 0.0f); glVertex3f(0.5f, 0.0f, 0.0f); glVertex3f(0.0f, 0.5f, 0.0f); glEnd(); glPopMatrix(); //Undo the move to the center of the pentagon glPushMatrix(); //Save the current state of transformations glTranslatef(-1.0f, 1.0f, 0.0f); //Move to the center of the triangle glRotatef(_angle, 1.0f, 2.0f, 3.0f); //Rotate about the the vector (1, 2, 3) glBegin(GL_TRIANGLES); //Triangle glVertex3f(0.5f, -0.5f, 0.0f); glVertex3f(0.0f, 0.5f, 0.0f); glVertex3f(-0.5f, -0.5f, 0.0f); glEnd(); glPopMatrix(); //Undo the move to the center of the triangle glutSwapBuffers(); } void update(int value) { _angle += 2.0f; if (_angle 360) { _angle -= 260; } glutPostRedisplay(); //Tell GLUT that the display has changed //Tell GLUT to call update again in 25 milliseconds glutTimerFunc(25, update, 0); } int main(int argc, char** argv) { //Initialize GLUT glutInit(&argc, argv); glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH); glutInitWindowSize(400, 400); //Create the window glutCreateWindow("Transformations and Timers - videotutorialsrock.com"); initRendering(); //Set handler functions glutDisplayFunc(drawScene); glutKeyboardFunc(handleKeypress); glutReshapeFunc(handleResize); glutTimerFunc(24, update, 0); //Add a timer glutMainLoop(); return 0; }

    Read the article

  • Why my VPN doesn't work anymore?

    - by xx77aBs
    I have openvpn server running on debian lenny. There is only one client - and it is running Windows 7 64-bit. This has worked for few months without any problems. And now, let's say for the last 7 days, it doesn't work at all. I connect successfully from client to the server, but I can't access anything through VPN. I have set it up so that all internet traffic is routed through VPN, and now when I connect with the client, the client can't do anything on the net (open any webpage, ping google, anything ...). Can you help me to figure out what's wrong ? I don't know where to start. I've also tried to connect to another openvpn server (I've installed and configured openvpn on another server, and when I try to connect to it result is the same). So I think there's something wrong with client ... Here is my connection log: Wed Apr 04 21:35:59 2012 OpenVPN 2.3-alpha1 Win32-MSVC++ [SSL (OpenSSL)] [LZO2] [PF_INET6] [IPv6 payload 20110522-1 (2.2.0)] built on Feb 21 2012 Enter Management Password: Wed Apr 04 21:35:59 2012 MANAGEMENT: TCP Socket listening on [AF_INET]127.0.0.10:25340 Wed Apr 04 21:35:59 2012 Need hold release from management interface, waiting... Wed Apr 04 21:36:00 2012 MANAGEMENT: Client connected from [AF_INET]127.0.0.10:25340 Wed Apr 04 21:36:00 2012 MANAGEMENT: CMD 'state on' Wed Apr 04 21:36:00 2012 MANAGEMENT: CMD 'log all on' Wed Apr 04 21:36:00 2012 MANAGEMENT: CMD 'hold off' Wed Apr 04 21:36:00 2012 MANAGEMENT: CMD 'hold release' Wed Apr 04 21:36:00 2012 WARNING: No server certificate verification method has been enabled. See http://openvpn.net/howto.html#mitm for more info. Wed Apr 04 21:36:00 2012 NOTE: OpenVPN 2.1 requires '--script-security 2' or higher to call user-defined scripts or executables Wed Apr 04 21:36:00 2012 Socket Buffers: R=[8192->8192] S=[8192->8192] Wed Apr 04 21:36:00 2012 MANAGEMENT: >STATE:1333568160,RESOLVE,,, Wed Apr 04 21:36:00 2012 UDPv4 link local: [undef] Wed Apr 04 21:36:00 2012 UDPv4 link remote: [AF_INET]11.22.33.44:1234 Wed Apr 04 21:36:00 2012 MANAGEMENT: >STATE:1333568160,WAIT,,, Wed Apr 04 21:36:00 2012 MANAGEMENT: >STATE:1333568160,AUTH,,, Wed Apr 04 21:36:00 2012 TLS: Initial packet from [AF_INET]11.22.33.44:1234, sid=ee329574 f15e9e04 Wed Apr 04 21:36:00 2012 VERIFY OK: depth=1, C=US, ST=CA, L=SanFrancisco, O=Fort-Funston, CN=Fort-Funston CA, [email protected] Wed Apr 04 21:36:00 2012 VERIFY OK: depth=0, C=US, ST=CA, L=SanFrancisco, O=Fort-Funston, CN=server_key, [email protected] Wed Apr 04 21:36:01 2012 Data Channel Encrypt: Cipher 'BF-CBC' initialized with 128 bit key Wed Apr 04 21:36:01 2012 Data Channel Encrypt: Using 160 bit message hash 'SHA1' for HMAC authentication Wed Apr 04 21:36:01 2012 Data Channel Decrypt: Cipher 'BF-CBC' initialized with 128 bit key Wed Apr 04 21:36:01 2012 Data Channel Decrypt: Using 160 bit message hash 'SHA1' for HMAC authentication Wed Apr 04 21:36:01 2012 Control Channel: TLSv1, cipher TLSv1/SSLv3 DHE-RSA-AES256-SHA, 1024 bit RSA Wed Apr 04 21:36:01 2012 [server_key] Peer Connection Initiated with [AF_INET]11.22.33.44:1234 Wed Apr 04 21:36:02 2012 MANAGEMENT: >STATE:1333568162,GET_CONFIG,,, Wed Apr 04 21:36:03 2012 SENT CONTROL [server_key]: 'PUSH_REQUEST' (status=1) Wed Apr 04 21:36:03 2012 PUSH: Received control message: 'PUSH_REPLY,redirect-gateway def1,route 172.16.100.1,topology net30,ping 10,ping-restart 120,ifconfig 172.16.100.6 172.16.100.5' Wed Apr 04 21:36:03 2012 OPTIONS IMPORT: timers and/or timeouts modified Wed Apr 04 21:36:03 2012 OPTIONS IMPORT: --ifconfig/up options modified Wed Apr 04 21:36:03 2012 OPTIONS IMPORT: route options modified Wed Apr 04 21:36:03 2012 ROUTE_GATEWAY 192.168.1.1/255.255.255.0 I=15 HWADDR=00:1f:1f:3f:61:55 Wed Apr 04 21:36:03 2012 do_ifconfig, tt->ipv6=0, tt->did_ifconfig_ipv6_setup=0 Wed Apr 04 21:36:03 2012 MANAGEMENT: >STATE:1333568163,ASSIGN_IP,,172.16.100.6, Wed Apr 04 21:36:03 2012 open_tun, tt->ipv6=0 Wed Apr 04 21:36:03 2012 TAP-WIN32 device [VPN] opened: \\.\Global\{E28FD52B-F6C3-4094-A36A-30CB02FAC7E8}.tap Wed Apr 04 21:36:03 2012 TAP-Win32 Driver Version 9.9 Wed Apr 04 21:36:03 2012 Notified TAP-Win32 driver to set a DHCP IP/netmask of 172.16.100.6/255.255.255.252 on interface {E28FD52B-F6C3-4094-A36A-30CB02FAC7E8} [DHCP-serv: 172.16.100.5, lease-time: 31536000] Wed Apr 04 21:36:03 2012 Successful ARP Flush on interface [31] {E28FD52B-F6C3-4094-A36A-30CB02FAC7E8} Wed Apr 04 21:36:08 2012 TEST ROUTES: 2/2 succeeded len=1 ret=1 a=0 u/d=up Wed Apr 04 21:36:08 2012 C:\Windows\system32\route.exe ADD 11.22.33.44 MASK 255.255.255.255 192.168.1.1 Wed Apr 04 21:36:08 2012 ROUTE: CreateIpForwardEntry succeeded with dwForwardMetric1=25 and dwForwardType=4 Wed Apr 04 21:36:08 2012 Route addition via IPAPI succeeded [adaptive] Wed Apr 04 21:36:08 2012 C:\Windows\system32\route.exe ADD 0.0.0.0 MASK 128.0.0.0 172.16.100.5 Wed Apr 04 21:36:08 2012 ROUTE: CreateIpForwardEntry succeeded with dwForwardMetric1=30 and dwForwardType=4 Wed Apr 04 21:36:08 2012 Route addition via IPAPI succeeded [adaptive] Wed Apr 04 21:36:08 2012 C:\Windows\system32\route.exe ADD 128.0.0.0 MASK 128.0.0.0 172.16.100.5 Wed Apr 04 21:36:08 2012 ROUTE: CreateIpForwardEntry succeeded with dwForwardMetric1=30 and dwForwardType=4 Wed Apr 04 21:36:08 2012 Route addition via IPAPI succeeded [adaptive] Wed Apr 04 21:36:08 2012 MANAGEMENT: >STATE:1333568168,ADD_ROUTES,,, Wed Apr 04 21:36:08 2012 C:\Windows\system32\route.exe ADD 172.16.100.1 MASK 255.255.255.255 172.16.100.5 Wed Apr 04 21:36:08 2012 ROUTE: CreateIpForwardEntry succeeded with dwForwardMetric1=30 and dwForwardType=4 Wed Apr 04 21:36:08 2012 Route addition via IPAPI succeeded [adaptive] Wed Apr 04 21:36:08 2012 Initialization Sequence Completed Wed Apr 04 21:36:08 2012 MANAGEMENT: >STATE:1333568168,CONNECTED,SUCCESS,172.16.100.6,11.22.33.44 Client's route table after connection with OpenVPN: IPv4 Route Table =========================================================================== Active Routes: Network Destination Netmask Gateway Interface Metric 0.0.0.0 0.0.0.0 192.168.1.1 192.168.1.41 281 0.0.0.0 128.0.0.0 172.16.100.1 172.16.100.6 31 94.23.53.45 255.255.255.255 192.168.1.1 192.168.1.41 25 127.0.0.0 255.0.0.0 On-link 127.0.0.1 306 127.0.0.1 255.255.255.255 On-link 127.0.0.1 306 127.255.255.255 255.255.255.255 On-link 127.0.0.1 306 128.0.0.0 128.0.0.0 172.16.100.1 172.16.100.6 31 172.16.100.4 255.255.255.252 On-link 172.16.100.6 286 172.16.100.6 255.255.255.255 On-link 172.16.100.6 286 172.16.100.7 255.255.255.255 On-link 172.16.100.6 286 192.168.1.0 255.255.255.0 On-link 192.168.1.41 281 192.168.1.41 255.255.255.255 On-link 192.168.1.41 281 192.168.1.255 255.255.255.255 On-link 192.168.1.41 281 224.0.0.0 240.0.0.0 On-link 127.0.0.1 306 224.0.0.0 240.0.0.0 On-link 192.168.1.41 281 224.0.0.0 240.0.0.0 On-link 172.16.100.6 286 255.255.255.255 255.255.255.255 On-link 127.0.0.1 306 255.255.255.255 255.255.255.255 On-link 192.168.1.41 281 255.255.255.255 255.255.255.255 On-link 172.16.100.6 286 =========================================================================== Persistent Routes: Network Address Netmask Gateway Address Metric 0.0.0.0 0.0.0.0 192.168.1.1 Default =========================================================================== IPv6 Route Table =========================================================================== Active Routes: If Metric Network Destination Gateway 13 58 ::/0 On-link 1 306 ::1/128 On-link 13 58 2001::/32 On-link 13 306 2001:0:5ef5:79fd:3cc3:6b9:ac7c:14db/128 On-link 15 281 fe80::/64 On-link 31 286 fe80::/64 On-link 13 306 fe80::/64 On-link 13 306 fe80::3cc3:6b9:ac7c:14db/128 On-link 31 286 fe80::7d72:9515:7213:35e3/128 On-link 15 281 fe80::9cec:ce3f:89de:a123/128 On-link 1 306 ff00::/8 On-link 13 306 ff00::/8 On-link 15 281 ff00::/8 On-link 31 286 ff00::/8 On-link =========================================================================== Persistent Routes: None

    Read the article

  • Make errors when compiling HPL-2.1 on MOSIX-clustered Debian server

    - by tlake
    I'm trying to compile HPL 2.1 on a MOSIX-clustered Debian server, but the make process terminates with errors as seen below. Included are my makefile and two versions of output: one from a standard execution, and one from an execution run with the debug flag. Any help and guidance would be very much appreciated! The makefile: # ---------------------------------------------------------------------- # - shell -------------------------------------------------------------- # ---------------------------------------------------------------------- # SHELL = /bin/bash # CD = cd CP = cp LN_S = ln -s MKDIR = mkdir RM = /bin/rm -f TOUCH = touch # # ---------------------------------------------------------------------- # - Platform identifier ------------------------------------------------ # ---------------------------------------------------------------------- # ARCH = Linux_PII_CBLAS # # ---------------------------------------------------------------------- # - HPL Directory Structure / HPL library ------------------------------ # ---------------------------------------------------------------------- # TOPdir = $(HOME)/hpl-2.1 INCdir = $(TOPdir)/include BINdir = $(TOPdir)/bin/$(ARCH) LIBdir = $(TOPdir)/lib/$(ARCH) # HPLlib = $(LIBdir)/libhpl.a # # ---------------------------------------------------------------------- # - Message Passing library (MPI) -------------------------------------- # ---------------------------------------------------------------------- # MPinc tells the C compiler where to find the Message Passing library # header files, MPlib is defined to be the name of the library to be # used. The variable MPdir is only used for defining MPinc and MPlib. # MPdir = /usr/local MPinc = -I$(MPdir)/include MPlib = $(MPdir)/lib/libmpi.so # # ---------------------------------------------------------------------- # - Linear Algebra library (BLAS or VSIPL) ----------------------------- # ---------------------------------------------------------------------- # LAinc tells the C compiler where to find the Linear Algebra library # header files, LAlib is defined to be the name of the library to be # used. The variable LAdir is only used for defining LAinc and LAlib. # LAdir = $(HOME)/CBLAS/lib LAinc = LAlib = $(LAdir)/cblas_LINUX.a # # ---------------------------------------------------------------------- # - F77 / C interface -------------------------------------------------- # ---------------------------------------------------------------------- # You can skip this section if and only if you are not planning to use # a BLAS library featuring a Fortran 77 interface. Otherwise, it is # necessary to fill out the F2CDEFS variable with the appropriate # options. **One and only one** option should be chosen in **each** of # the 3 following categories: # # 1) name space (How C calls a Fortran 77 routine) # # -DAdd_ : all lower case and a suffixed underscore (Suns, # Intel, ...), [default] # -DNoChange : all lower case (IBM RS6000), # -DUpCase : all upper case (Cray), # -DAdd__ : the FORTRAN compiler in use is f2c. # # 2) C and Fortran 77 integer mapping # # -DF77_INTEGER=int : Fortran 77 INTEGER is a C int, [default] # -DF77_INTEGER=long : Fortran 77 INTEGER is a C long, # -DF77_INTEGER=short : Fortran 77 INTEGER is a C short. # # 3) Fortran 77 string handling # # -DStringSunStyle : The string address is passed at the string loca- # tion on the stack, and the string length is then # passed as an F77_INTEGER after all explicit # stack arguments, [default] # -DStringStructPtr : The address of a structure is passed by a # Fortran 77 string, and the structure is of the # form: struct {char *cp; F77_INTEGER len;}, # -DStringStructVal : A structure is passed by value for each Fortran # 77 string, and the structure is of the form: # struct {char *cp; F77_INTEGER len;}, # -DStringCrayStyle : Special option for Cray machines, which uses # Cray fcd (fortran character descriptor) for # interoperation. # F2CDEFS = # # ---------------------------------------------------------------------- # - HPL includes / libraries / specifics ------------------------------- # ---------------------------------------------------------------------- # HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc) HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib) # # - Compile time options ----------------------------------------------- # # -DHPL_COPY_L force the copy of the panel L before bcast; # -DHPL_CALL_CBLAS call the cblas interface; # -DHPL_CALL_VSIPL call the vsip library; # -DHPL_DETAILED_TIMING enable detailed timers; # # By default HPL will: # *) not copy L before broadcast, # *) call the BLAS Fortran 77 interface, # *) not display detailed timing information. # HPL_OPTS = -DHPL_CALL_CBLAS # # ---------------------------------------------------------------------- # HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES) # # ---------------------------------------------------------------------- # - Compilers / linkers - Optimization flags --------------------------- # ---------------------------------------------------------------------- # CC = /usr/bin/gcc CCNOOPT = $(HPL_DEFS) CCFLAGS = $(HPL_DEFS) -fomit-frame-pointer -O3 -funroll-loops # # On some platforms, it is necessary to use the Fortran linker to find # the Fortran internals used in the BLAS library. # LINKER = ~/BLAS LINKFLAGS = $(CCFLAGS) # ARCHIVER = ar ARFLAGS = r RANLIB = echo # # ---------------------------------------------------------------------- Make output: ~/BLAS -DHPL_CALL_CBLAS -I/homes/laket/hpl-2.1/include -I/homes/laket/hpl-2.1/include/Linux_PII_CBLAS -I/usr/local/include -fomit-frame-pointer -O3 -funroll-loops -o /homes/laket/hpl-2.1/bin/Linux_PII_CBLAS/xhpl HPL_pddriver.o HPL_pdinfo.o HPL_pdtest.o /homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/libhpl.a /homes/laket/CBLAS/lib/cblas_LINUX.a /usr/local/lib/libmpi.so /bin/bash: /homes/laket/BLAS: Is a directory make[2]: *** [dexe.grd] Error 126 make[2]: Target `all' not remade because of errors. make[2]: Leaving directory `/homes/laket/hpl-2.1/testing/ptest/Linux_PII_CBLAS' make[1]: *** [build_tst] Error 2 make[1]: Leaving directory `/homes/laket/hpl-2.1' make: *** [build] Error 2 make: Target `all' not remade because of errors. Make -d output: Considering target file `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/libhpl.a'. Looking for an implicit rule for `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/libhpl.a'. Trying pattern rule with stem `libhpl.a'. Trying implicit prerequisite `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/libhpl.a,v'. Trying pattern rule with stem `libhpl.a'. Trying implicit prerequisite `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/RCS/libhpl.a,v'. Trying pattern rule with stem `libhpl.a'. Trying implicit prerequisite `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/RCS/libhpl.a'. Trying pattern rule with stem `libhpl.a'. Trying implicit prerequisite `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/s.libhpl.a'. Trying pattern rule with stem `libhpl.a'. Trying implicit prerequisite `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/SCCS/s.libhpl.a'. No implicit rule found for `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/libhpl.a'. Finished prerequisites of target file `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/libhpl.a'. No need to remake target `/homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/libhpl.a'. Finished prerequisites of target file `dexe.grd'. Must remake target `dexe.grd'. ~/BLAS -DHPL_CALL_CBLAS -I/homes/laket/hpl-2.1/include -I/homes/laket/hpl-2.1/include/Linux_PII_CBLAS -I/usr/local/include -fomit-frame-pointer -O3 -funroll-loops -o /homes/laket/hpl-2.1/bin/Linux_PII_CBLAS/xhpl HPL_pddriver.o HPL_pdinfo.o HPL_pdtest.o /homes/laket/hpl-2.1/lib/Linux_PII_CBLAS/libhpl.a /homes/laket/CBLAS/lib/cblas_LINUX.a /usr/local/lib/libmpi.so Putting child 0x0129a2c0 (dexe.grd) PID 24853 on the chain. Live child 0x0129a2c0 (dexe.grd) PID 24853 /bin/bash: /homes/laket/BLAS: Is a directory make[2]: Reaping losing child 0x0129a2c0 PID 24853 *** [dexe.grd] Error 126 Removing child 0x0129a2c0 PID 24853 from chain. Failed to remake target file `dexe.grd'. Finished prerequisites of target file `dexe'. Giving up on target file `dexe'. Finished prerequisites of target file `all'. Giving up on target file `all'. make[2]: Target `all' not remade because of errors. make[2]: Leaving directory `/homes/laket/hpl-2.1/testing/ptest/Linux_PII_CBLAS' Reaping losing child 0x010ce900 PID 24841 make[1]: *** [build_tst] Error 2 Removing child 0x010ce900 PID 24841 from chain. Failed to remake target file `build_tst'. make[1]: Leaving directory `/homes/laket/hpl-2.1' Reaping losing child 0x00d91ae0 PID 24774 make: *** [build] Error 2 Removing child 0x00d91ae0 PID 24774 from chain. Failed to remake target file `build'. Finished prerequisites of target file `install'. make: Target `all' not remade because of errors. Giving up on target file `install'. Finished prerequisites of target file `all'. Giving up on target file `all'. Thanks!

    Read the article

  • My App crashes when launched on my Iphone

    - by Miky Mike
    hi guys, I have a problem here : my app crashed on my Iphone (JB) though Xcode doesn't complain about anything. The app works fine on the simulator though. However, there is this in the device logs : Thread 0 Crashed: 0 libSystem.B.dylib 0x00078ac8 kill + 8 1 libSystem.B.dylib 0x00078ab8 kill + 4 2 libSystem.B.dylib 0x00078aaa raise + 10 3 libSystem.B.dylib 0x0008d03a abort + 50 4 libstdc++.6.dylib 0x00044a20 __gnu_cxx::__verbose_terminate_handler() + 376 5 libobjc.A.dylib 0x00005958 _objc_terminate + 104 6 libstdc++.6.dylib 0x00042df2 _cxxabiv1::_terminate(void (*)()) + 46 7 libstdc++.6.dylib 0x00042e46 std::terminate() + 10 8 libstdc++.6.dylib 0x00042f16 __cxa_throw + 78 9 libobjc.A.dylib 0x00004838 objc_exception_throw + 64 10 CoreFoundation 0x0009fd0e +[NSException raise:format:arguments:] + 62 11 CoreFoundation 0x0009fd48 +[NSException raise:format:] + 28 12 Foundation 0x000125d8 -[NSURL(NSURL) initFileURLWithPath:] + 64 13 Foundation 0x000371e0 +[NSURL(NSURL) fileURLWithPath:] + 24 14 TheLearningMachine 0x00002d08 0x1000 + 7432 15 TheLearningMachine 0x00002e8c 0x1000 + 7820 16 TheLearningMachine 0x00002be4 0x1000 + 7140 17 TheLearningMachine 0x000029b6 0x1000 + 6582 18 UIKit 0x0000e47a -[UIApplication _callInitializationDelegatesForURL:payload:suspended:] + 766 19 UIKit 0x000049e0 -[UIApplication _runWithURL:payload:launchOrientation:statusBarStyle:statusBarHidden:] + 200 20 UIKit 0x0005dfd6 -[UIApplication handleEvent:withNewEvent:] + 1390 21 UIKit 0x0005d8fa -[UIApplication sendEvent:] + 38 22 UIKit 0x0005d330 _UIApplicationHandleEvent + 5104 23 GraphicsServices 0x00005044 PurpleEventCallback + 660 24 CoreFoundation 0x00034cdc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION + 20 25 CoreFoundation 0x00034ca0 __CFRunLoopDoSource1 + 160 26 CoreFoundation 0x00027566 __CFRunLoopRun + 514 27 CoreFoundation 0x00027270 CFRunLoopRunSpecific + 224 28 CoreFoundation 0x00027178 CFRunLoopRunInMode + 52 29 UIKit 0x000040fc -[UIApplication _run] + 364 30 UIKit 0x00002128 UIApplicationMain + 664 31 TheLearningMachine 0x00002948 0x1000 + 6472 32 TheLearningMachine 0x000028fc 0x1000 + 6396 Thread 1: 0 libSystem.B.dylib 0x0002d330 kevent + 24 1 libSystem.B.dylib 0x000d6b6c _dispatch_mgr_invoke + 88 2 libSystem.B.dylib 0x000d65bc _dispatch_queue_invoke + 96 3 libSystem.B.dylib 0x000d675c _dispatch_worker_thread2 + 120 4 libSystem.B.dylib 0x0007a67a _pthread_wqthread + 258 5 libSystem.B.dylib 0x00073190 start_wqthread + 0 Thread 2: 0 libSystem.B.dylib 0x0007b19c __workq_kernreturn + 8 1 libSystem.B.dylib 0x0007a790 _pthread_wqthread + 536 2 libSystem.B.dylib 0x00073190 start_wqthread + 0 Thread 3: 0 libSystem.B.dylib 0x00000c98 mach_msg_trap + 20 1 libSystem.B.dylib 0x00002d64 mach_msg + 44 2 CoreFoundation 0x00027c38 __CFRunLoopServiceMachPort + 88 3 CoreFoundation 0x000274c2 __CFRunLoopRun + 350 4 CoreFoundation 0x00027270 CFRunLoopRunSpecific + 224 5 CoreFoundation 0x00027178 CFRunLoopRunInMode + 52 6 WebCore 0x000024e2 RunWebThread(void*) + 362 7 libSystem.B.dylib 0x0007a27e _pthread_start + 242 8 libSystem.B.dylib 0x0006f2a8 thread_start + 0 Thread 0 crashed with ARM Thread State: r0: 0x00000000 r1: 0x00000000 r2: 0x00000001 r3: 0x3e0862b4 r4: 0x00000006 r5: 0x0015a2ec r6: 0x2fffe090 r7: 0x2fffe0a0 r8: 0x3e1a378c r9: 0x00000065 r10: 0x33028e5a r11: 0x3e1ab89c ip: 0x00000025 sp: 0x2fffe0a0 lr: 0x30277abf pc: 0x30277ac8 cpsr: 0x000f0010 Any idea what the problem can be ? I've already spent my whole day on that, but... I'm stuck. Thanks in advance... Miky Mike Ok, Here is more then from the console, I get this : This GDB was configured as "--host=i386-apple-darwin --target=arm-apple-darwin".tty /dev/ttys002 Loading program into debugger… Program loaded. target remote-mobile /tmp/.XcodeGDBRemote-17280-65 Switching to remote-macosx protocol mem 0x1000 0x3fffffff cache mem 0x40000000 0xffffffff none mem 0x00000000 0x0fff none run Running… Error launching remote program: failed to get the task for process 456. Error launching remote program: failed to get the task for process 456. The program being debugged is not being run. The program being debugged is not being run. [Session started at 2010-12-23 20:33:33 +0100.] GNU gdb 6.3.50-20050815 (Apple version gdb-1472) (Thu Aug 5 05:54:10 UTC 2010) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "--host=i386-apple-darwin --target=arm-apple-darwin".tty /dev/ttys004 Loading program into debugger… Program loaded. target remote-mobile /tmp/.XcodeGDBRemote-17280-72 Switching to remote-macosx protocol mem 0x1000 0x3fffffff cache mem 0x40000000 0xffffffff none mem 0x00000000 0x0fff none run Running… Error launching remote program: failed to get the task for process 508. Error launching remote program: failed to get the task for process 508. The program being debugged is not being run. The program being debugged is not being run. And here is the code page that calls the URL import "TheLearningMachineAppDelegate.h" import "RootViewController.h" @implementation TheLearningMachineAppDelegate @synthesize window; @synthesize navigationController; pragma mark - pragma mark Application lifecycle (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions { RootViewController *rootViewController = (RootViewController *)[navigationController topViewController]; rootViewController.managedObjectContext = self.managedObjectContext; [window addSubview:[navigationController view]]; [window makeKeyAndVisible]; return YES; } (void)applicationWillResignActive:(UIApplication )application { / Sent when the application is about to move from active to inactive state. This can occur for certain types of temporary interruptions (such as an incoming phone call or SMS message) or when the user quits the application and it begins the transition to the background state. Use this method to pause ongoing tasks, disable timers, and throttle down OpenGL ES frame rates. Games should use this method to pause the game. */ } (void)applicationDidEnterBackground:(UIApplication *)application { [self saveContext]; } (void)applicationWillEnterForeground:(UIApplication )application { / Called as part of the transition from the background to the inactive state: here you can undo many of the changes made on entering the background. */ } (void)applicationDidBecomeActive:(UIApplication )application { / Restart any tasks that were paused (or not yet started) while the application was inactive. If the application was previously in the background, optionally refresh the user interface. */ } // Method that saves the managed object context before the application terminates. (void)applicationWillTerminate:(UIApplication *)application { [self saveContext]; } (void)saveContext { NSError *error = nil; if (managedObjectContext != nil) { if ([managedObjectContext hasChanges] && ![managedObjectContext save:&error]) { NSLog(@"Unresolved error %@, %@", error, [error userInfo]); abort(); //Replace this implementation with code to handle the error appropriately. //abort() causes the application to generate a crash log and terminate. You should not use this function in a shipping application, although it may be useful during development. If it is not possible to recover from the error, display an alert panel that instructs the user to quit the application by pressing the Home button. } } } pragma mark - pragma mark Core Data stack // Returns the managed object context for the application. (NSManagedObjectContext *)managedObjectContext { if (managedObjectContext != nil) { return managedObjectContext; } NSPersistentStoreCoordinator *coordinator = [self persistentStoreCoordinator]; if (coordinator != nil) { managedObjectContext = [[NSManagedObjectContext alloc] init]; [managedObjectContext setPersistentStoreCoordinator:coordinator]; } return managedObjectContext; } // Returns the managed object model for the application. (NSManagedObjectModel *)managedObjectModel { if (managedObjectModel != nil) { return managedObjectModel; } NSString *modelPath = [[NSBundle mainBundle] pathForResource:@"TheLearningMachine" ofType:@"momd"]; NSURL *modelURL = [NSURL fileURLWithPath:modelPath]; managedObjectModel = [[NSManagedObjectModel alloc] initWithContentsOfURL:modelURL]; return managedObjectModel; } pragma mark - pragma mark Application's Documents directory // Returns the path to the application's Documents directory. - (NSString *)applicationDocumentsDirectory { return [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject]; } // Returns the persistent store coordinator for the application. - (NSPersistentStoreCoordinator *)persistentStoreCoordinator { if (persistentStoreCoordinator != nil) { return persistentStoreCoordinator; } NSURL *storeURL = [NSURL fileURLWithPath: [[self applicationDocumentsDirectory] stringByAppendingPathComponent: @"TheLearningMachine.sqlite"]]; NSError *error = nil; persistentStoreCoordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:[self managedObjectModel]]; if (![persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType configuration:nil URL:storeURL options:nil error:&error]) { NSLog(@"Unresolved error %@, %@", error, [error userInfo]); abort(); } return persistentStoreCoordinator; } pragma mark - pragma mark Memory management (void)applicationDidReceiveMemoryWarning:(UIApplication )application { / Free up as much memory as possible by purging cached data objects that can be recreated (or reloaded from disk) later. */ } (void)dealloc { [managedObjectContext release]; [managedObjectModel release]; [persistentStoreCoordinator release]; [window release]; [super dealloc]; } @end

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

< Previous Page | 7 8 9 10 11