Search Results

Search found 2565 results on 103 pages for 'reduce'.


  • Web optimization

    - by hmloo
     1. CSS Optimization

     Organize your CSS code. Good CSS organization helps with future maintainability of the site; it helps you and your team members understand the CSS more quickly and jump to specific styles.

     Structure CSS code. For a small project, you can break your CSS into separate blocks according to the structure of the page or the page content, for example Header, Main Content, and Footer.

     Structure CSS files. For a large project you may end up with too much CSS in one place, so it is best to split your CSS into several files and use a master style sheet to import them. This not only organizes the style structure but also reduces server requests.
     /*-------------- Master style sheet --------------*/
     @import "Reset.css";
     @import "Structure.css";
     @import "Typography.css";
     @import "Forms.css";

     Create an index for your CSS. Another important thing is to create an index at the beginning of your CSS file; the index helps you quickly understand the whole CSS structure.
     /*----------------------------------------
     1. Header
     2. Navigation
     3. Main Content
     4. Sidebar
     5. Footer
     ------------------------------------------*/

     Write efficient CSS selectors. Keep in mind that browsers match CSS selectors from right to left, and the order of efficiency for selectors is:
     1. id (#myid)
     2. class (.myclass)
     3. tag (div, h1, p)
     4. adjacent sibling (h1 + p)
     5. child (ul > li)
     6. descendant (li a)
     7. universal (*)
     8. attribute (a[rel="external"])
     9. pseudo-class and pseudo-element (a:hover, li:first)
     The rightmost selector is called the "key selector", so when you write your CSS you should choose a more efficient key selector. Here are some best practices:

     Don't tag-qualify. Never do this: div#myid, div.myclass, .myclass#myid. IDs are unique and classes are more specific than a tag, so they don't need a tag; adding one makes the selector less efficient.

     Avoid overqualifying selectors. For example, #nav a is more efficient than ul#nav li a.

     Don't repeat declarations. Example: body {font-size:12px;} h1 {font-size:12px;font-weight:bold;}. Since h1 already inherits from body, you don't need to repeat the attribute.

     Use 0 instead of 0px. Always use #selector { margin: 0; }. There's no need to include the px after 0, and removing all those superfluous px can reduce the size of your CSS file.

     Group declarations. Example: h1 { font-size: 16pt; } h1 { color: #fff; } h1 { font-family: Arial, sans-serif; }. It's much better to combine them: h1 { font-size: 16pt; color: #fff; font-family: Arial, sans-serif; }

     Group selectors. Example: h1 { color: #fff; font-family: Arial, sans-serif; } h2 { color: #fff; font-family: Arial, sans-serif; }. It would be much better set up as: h1, h2 { color: #fff; font-family: Arial, sans-serif; }

     Group attributes. Example: h1 { color: #fff; font-family: Arial, sans-serif; } h2 { color: #fff; font-family: Arial, sans-serif; font-size: 16pt; }. You can set different rules for specific elements after setting a rule for a group: h1, h2 { color: #fff; font-family: Arial, sans-serif; } h2 { font-size: 16pt; }

     Use shorthand properties. Example: #selector { margin-top: 8px; margin-right: 4px; margin-bottom: 8px; margin-left: 4px; } Better: #selector { margin: 8px 4px 8px 4px; } Best: #selector { margin: 8px 4px; } A good diagram illustrates how shorthand declarations are interpreted depending on how many values are specified for the margin and padding properties.
     Instead of using: #selector { background-image: url("logo.png"); background-position: top left; background-repeat: no-repeat; } use: #selector { background: url(logo.png) no-repeat top left; }

     2. Image Optimization

     Image Optimizer. Image Optimizer is a free Visual Studio 2010 extension that optimizes PNG, GIF and JPG file sizes without quality loss. It uses Smush.it and PunyPNG for the optimization. Just right-click on any folder or image in Solution Explorer and choose "Optimize images", and it will automatically optimize all PNG, GIF and JPEG files in that folder.

     CSS Image Sprites. CSS image sprites are a way to combine a collection of images into a single image, then use the CSS background-position property to shift the visible area to show the required image. Many images can take a long time to load and generate multiple server requests, so an image sprite can reduce the number of server requests and improve site performance. You can use many online tools to generate your image sprite and CSS, and you can also try the Sprite and Image Optimization framework released by the ASP.NET team.

    Read the article

  • 10 Essential Tools for building ASP.NET Websites

    - by Stephen Walther
    I recently put together a simple public website created with ASP.NET for my company at Superexpert.com. I was surprised by the number of free tools that I ended up using to put together the website. Therefore, I thought it would be interesting to create a list of essential tools for building ASP.NET websites. These tools work equally well with both ASP.NET Web Forms and ASP.NET MVC. Performance Tools After reading Steve Souders two (very excellent) books on front-end website performance High Performance Web Sites and Even Faster Web Sites, I have been super sensitive to front-end website performance. According to Souders’ Performance Golden Rule: “Optimize front-end performance first, that's where 80% or more of the end-user response time is spent” You can use the tools below to reduce the size of the images, JavaScript files, and CSS files used by an ASP.NET application. 1. Sprite and Image Optimization Framework CSS sprites were first described in an article written for A List Apart entitled CSS sprites: Image Slicing’s Kiss of Death. When you use sprites, you combine multiple images used by a website into a single image. Next, you use CSS trickery to display particular sub-images from the combined image in a webpage. The primary advantage of sprites is that they reduce the number of requests required to display a webpage. Requesting a single large image is faster than requesting multiple small images. In general, the more resources – images, JavaScript files, CSS files – that must be moved across the wire, the slower your website. However, most people avoid using sprites because they require a lot of work. You need to combine all of the images and write just the right CSS rules to display the sub-images. The Microsoft Sprite and Image Optimization Framework enables you to avoid all of this work. The framework combines the images for you automatically. Furthermore, the framework includes an ASP.NET Web Forms control and an ASP.NET MVC helper that makes it easy to display the sub-images. You can download the Sprite and Image Optimization Framework from CodePlex at http://aspnet.codeplex.com/releases/view/50869. The Sprite and Image Optimization Framework was written by Morgan McClean who worked in the office next to mine at Microsoft. Morgan was a scary smart Intern from Canada and we discussed the Framework while he was building it (I was really excited to learn that he was working on it). Morgan added some great advanced features to this framework. For example, the Sprite and Image Optimization Framework supports something called image inlining. When you use image inlining, the actual image is stored in the CSS file. Here’s an example of what image inlining looks like: .Home_StephenWalther_small-jpg { width:75px; height:100px; background: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAEsAAABkCAIAAABB1lpeAAAAB GdBTUEAALGOfPtRkwAAACBjSFJNAACHDwAAjA8AAP1SAACBQAAAfXkAAOmLAAA85QAAGcxzPIV3AAAKL s+zNfREAAAAASUVORK5CYII=) no-repeat 0% 0%; } The actual image (in this case a picture of me that is displayed on the home page of the Superexpert.com website) is stored in the CSS file. If you visit the Superexpert.com website then very few separate images are downloaded. For example, all of the images with a red border in the screenshot below take advantage of CSS sprites: Unfortunately, there are some significant Gotchas that you need to be aware of when using the Sprite and Image Optimization Framework. There are workarounds for these Gotchas. 
I plan to write about these Gotchas and workarounds in a future blog entry. 2. Microsoft Ajax Minifier Whenever possible you should combine, minify, compress, and cache with a far future header all of your JavaScript and CSS files. The Microsoft Ajax Minifier makes it easy to minify JavaScript and CSS files. Don’t confuse minification and compression. You need to do both. According to Souders, you can reduce the size of a JavaScript file by an additional 20% (on average) by minifying a JavaScript file after you compress the file. When you minify a JavaScript or CSS file, you use various tricks to reduce the size of the file before you compress the file. For example, you can minify a JavaScript file by replacing long JavaScript variables names with short variables names and removing unnecessary white space and comments. You can minify a CSS file by doing such things as replacing long color names such as #ffffff with shorter equivalents such as #fff. The Microsoft Ajax Minifier was created by Microsoft employee Ron Logan. Internally, this tool was being used by several large Microsoft websites. We also used the tool heavily on the ASP.NET team. I convinced Ron to publish the tool on CodePlex so that everyone in the world could take advantage of it. You can download the tool from the ASP.NET Ajax website and read documentation for the tool here. I created the installer for the Microsoft Ajax Minifier. When creating the installer, I also created a Visual Studio build task to make it easy to minify all of your JavaScript and CSS files whenever you do a build within Visual Studio automatically. Read the Ajax Minifier Quick Start to learn how to configure the build task. 3. ySlow The ySlow tool is a free add-on for Firefox created by Yahoo that enables you to test the front-end of your website. For example, here are the current test results for the Superexpert.com website: The Superexpert.com website has an overall score of B (not perfect but not bad). The ySlow tool is not perfect. For example, the Superexpert.com website received a failing grade of F for not using a Content Delivery Network even though the website using the Microsoft Ajax Content Delivery Network for JavaScript files such as jQuery. Uptime After publishing a website live to the world, you want to ensure that the website does not encounter any issues and that it stays live. I use the following tools to monitor the Superexpert.com website now that it is live. 4. ELMAH ELMAH stands for Error Logging Modules and Handlers for ASP.NET. ELMAH enables you to record any errors that happen at your website so you can review them in the future. You can download ELMAH for free from the ELMAH project website. ELMAH works great with both ASP.NET Web Forms and ASP.NET MVC. You can configure ELMAH to store errors in a number of different stores including XML files, the Event Log, an Access database, a SQL database, an Oracle database, or in computer RAM. You also can configure ELMAH to email error messages to you when they happen. By default, you can access ELMAH by requesting the elmah.axd page from a website with ELMAH installed. Here’s what the elmah page looks like from the Superexpert.com website (this page is password-protected because secret information can be revealed in an error message): If you click on a particular error message, you can view the original Yellow Screen ASP.NET error message (even when the error message was never displayed to the actual user). 
I installed ELMAH by taking advantage of the new package manager for ASP.NET named NuGet (originally named NuPack). You can read the details about NuGet in the following blog entry by Scott Guthrie. You can download NuGet from CodePlex. 5. Pingdom I use Pingdom to verify that the Superexpert.com website is always up. You can sign up for Pingdom by visiting Pingdom.com. You can use Pingdom to monitor a single website for free. At the Pingdom website, you configure the frequency that your website gets pinged. I verify that the Superexpert.com website is up every 5 minutes. I have the Pingdom service verify that it can retrieve the string “Contact Us” from the website homepage. If your website goes down, you can configure Pingdom so that it sends an email, Twitter, SMS, or iPhone alert. I use the Pingdom iPhone app which looks like this: 6. Host Tracker If your website does go down then you need some way of determining whether it is a problem with your local network or if your website is down for everyone. I use a website named Host-Tracker.com to check how badly a website is down. Here’s what the Host-Tracker website displays for the Superexpert.com website when the website can be successfully pinged from everywhere in the world: Notice that Host-Tracker pinged the Superexpert.com website from 68 locations including Roubaix, France and Scranton, PA. Debugging I mean debugging in the broadest possible sense. I use the following tools when building a website to verify that I have not made a mistake. 7. HTML Spell Checker Why doesn’t Visual Studio have a built-in spell checker? Don’t know – I’ve always found this mysterious. Fortunately, however, a former member of the ASP.NET team wrote a free spell checker that you can use with your ASP.NET pages. I find a spell checker indispensible. It is easy to delude yourself that you are capable of perfect spelling. I’m always super embarrassed when I actually run the spell checking tool and discover all of my spelling mistakes. The fastest way to add the HTML Spell Checker extension to Visual Studio is to select the menu option Tools, Extension Manager within Visual Studio. Click on Online Gallery and search for HTML Spell Checker: 8. IIS SEO Toolkit If people cannot find your website through Google then you should not even bother to create it. Microsoft has a great extension for IIS named the IIS Search Engine Optimization Toolkit that you can use to identify issue with your website that would hurt its page rank. You also can use this tool to quickly create a sitemap for your website that you can submit to Google or Bing. You can even generate the sitemap for an ASP.NET MVC website. Here’s what the report overview for the Superexpert.com website looks like: Notice that the Sueprexpert.com website had plenty of violations. For example, there are 65 cases in which a page has a broken hyperlink. You can drill into these violations to identity the exact page and location where these violations occur. 9. LinqPad If your ASP.NET website accesses a database then you should be using LINQ to Entities with the Entity Framework. Using LINQ involves some magic. LINQ queries written in C# get converted into SQL queries for you. If you are not careful about how you write your LINQ queries, you could unintentionally build a really badly performing website. LinqPad is a free tool that enables you to experiment with your LINQ queries. It even works with Microsoft SQL CE 4 and Azure. You can use LinqPad to execute a LINQ to Entities query and see the results. 
You also can use it to see the resulting SQL that gets executed against the database: 10. .NET Reflector I use .NET Reflector daily. The .NET Reflector tool enables you to take any assembly and disassemble the assembly into C# or VB.NET code. You can use .NET Reflector to see the “Source Code” of an assembly even when you do not have the actual source code. You can download a free version of .NET Reflector from the Redgate website. I use .NET Reflector primarily to help me understand what code is doing internally. For example, I used .NET Reflector with the Sprite and Image Optimization Framework to better understand how the MVC Image helper works. Here’s part of the disassembled code from the Image helper class: Summary In this blog entry, I’ve discussed several of the tools that I used to create the Superexpert.com website. These are tools that I use to improve the performance, improve the SEO, verify the uptime, or debug the Superexpert.com website. All of the tools discussed in this blog entry are free. Furthermore, all of these tools work with both ASP.NET Web Forms and ASP.NET MVC. Let me know if there are any tools that you use daily when building ASP.NET websites.

    Read the article

  • CLSF & CLK 2013 Trip Report by Jeff Liu

    - by jamesmorris
     This is a contributed post from Jeff Liu, lead XFS developer for the Oracle mainline Linux kernel team. Recently, I attended both the China Linux Storage and Filesystem workshop (CLSF) and the China Linux Kernel conference (CLK), which were held in Shanghai. Here are the highlights of both events.

     CLSF - 17th October

     XFS update (led by Jeff Liu)
     XFS keeps making rapid progress with a lot of changes, especially focused on infrastructure/performance improvements as well as new feature development. This is reflected in a sample statistic comparing XFS, Ext4+JBD2 and Btrfs via:
     # git diff --stat --minimal -C -M v3.7..v3.12-rc4 -- fs/xfs|fs/ext4+fs/jbd2|fs/btrfs
     XFS: 141 files changed, 27598 insertions(+), 19113 deletions(-)
     Ext4+JBD2: 39 files changed, 10487 insertions(+), 5454 deletions(-)
     Btrfs: 70 files changed, 19875 insertions(+), 8130 deletions(-)
     What made up those changes in XFS?
     Self-describing metadata (CRC32c). This is a new feature and it contributed about 70% of the code changes; it can be enabled via `mkfs.xfs -m crc=1 /dev/xxx` for the v5 superblock.
     Transaction log space reservation improvements. With this change, we can calculate the log space reservation at mount time rather than at runtime to reduce CPU overhead.
     User namespace support. Both XFS and USERNS can be enabled in the kernel configuration beginning with Linux 3.10. Thanks to Dwight Engen for his efforts on this.
     Split project/group quota inodes. Originally, project quota could not be enabled together with group quota because they shared the same quota file inode; now it works, but only for the v5 superblock, i.e. with CRC enabled.
     CONFIG_XFS_WARN, a new lightweight runtime debugger which can be deployed in production environments.
     Readahead of log objects during recovery; this change can speed up log replay significantly.
     Speculative preallocation inode tracking, clearing and throttling. The main purpose is to deal with inodes that have post-EOF space due to speculative preallocation, and to support improved quota management that frees up a significant amount of unwritten space when at or near EDQUOT. It supports background scanning, which occurs on a longish interval (5 minutes by default, tunable), and on-demand scanning/trimming via ioctl(2).
     Bitter arguments ensued from this session, especially over the comparison between Ext4 and Btrfs in different areas; I had to spend the whole morning of the first day answering those questions. We basically agreed that XFS is the best choice on Linux nowadays because:
     Stability: XFS has a good stability record over the past 10 years. Fengguang Wu, who leads the 0-day kernel test project, also said that he has observed fewer errors than in other filesystems over the past year or more. I owe that to the XFS upstream code reviewers, who always perform serious code review as well as testing.
     Good performance for large and small files: "XFS does not work very well for small files" has been an old story for years.
     Best choice (maybe) for distributed PB-scale filesystems: e.g. Ceph recommends deploying the OSD daemon on XFS because Ext4 has a limited xattr size.
     Best choice for large storage (>16TB): Ext4 does not support a single file larger than around 15.95TB.
     Scalability: any objection to XFS being best on this point? :) XFS deals with transaction concurrency better than Ext4. Why? The maximum size of the log in XFS is 2038MB compared to 128MB in Ext4.
     Misc: Ext4 is widely used and has proved fast/stable in various loads and scenarios; XFS just needs more customers, and Btrfs is still on the road to maturity.
     Ceph Introduction (led by Li Wang)
     This was a hot topic. Li gave us a nice introduction to the design as well as their current work. The Ceph client has actually been included in the Linux kernel since 2.6.34 and supported by OpenStack since Folsom, but it seems it has not yet been widely deployed in production environments. Their major work focuses on inline data support to separate metadata and data storage and reduce file access time: a file access needs two round trips, fetching the metadata from the MDS and then getting the data from the OSD, and small file access is also limited by network latency. The solution is, for small files, to store the data with the metadata so that when accessing a small file, the metadata server can push both metadata and data to the client at the same time. In this way, they can reduce the overhead of calculating the data offset and save the communication with the OSD. For this feature, they have only run some small-scale testing but saw really noticeable improvements. Test environment: Intel 2-CPU 12-core, 64GB RAM, Ubuntu 12.04, Ceph 0.56.6 with 200GB SATA disks, 15 OSDs, 1 MDS, 1 MON. The sequential read performance for 1K-size files improved by about 50%. I asked Li and Zheng Yan (the core developer of Ceph, who also worked on Btrfs) whether Ceph is really stable and can be deployed in production for large-scale PB-level storage, but they could not give a positive answer; it looks like Ceph is not even rolled out across Dreamhost (subject to confirmation). According to Li, they have only deployed Ceph for small-scale storage (32 nodes), although they'd like to try 6000 nodes in the future.

     Improve Linux swap for flash storage (led by Shaohua Li)
     Because of its high density, low power and low price, flash storage (SSD) is a good candidate to partially replace DRAM. A quick answer for this is using SSD as swap. But Linux swap is designed for slow hard disk storage, so there are a lot of challenges to using SSD efficiently for swap.
     SWAPOUT
     swap_map scan: swap_map is the in-memory data structure that tracks swap disk usage, but it uses a slow linear scan. It becomes a bottleneck when finding many adjacent pages for use on SSD. Shaohua Li has changed it to a cluster (128K) list, resulting in an O(1) algorithm. However, this approach needs restrictive cluster alignment and is only enabled for SSD.
     IO pattern: In most cases, swap IO is in an interleaved pattern because of multiple reclaimers, or because a free cluster is shared by all reclaimers. Even though the block layer can merge interleaved IO to some extent, we cannot count on it completely. Hence a per-cpu cluster was added on top of the previous change; it helps a reclaimer do sequential IO and makes it easier for the block layer to merge IO.
     TLB flush: If we're reclaiming one active page, we should first move the page from the active LRU list to the inactive LRU list, and then reclaim the page from the inactive LRU to swap it out. During this process, we need to clear the PTE twice: first the 'A' (ACCESS) bit, then the 'P' (PRESENT) bit. Processors need to send lots of IPIs, which make the TLB flush really expensive. Some work has been done to improve this, including reworking smp_call_function_many() or removing the first TLB flush on x86, but there are still some arguments here and only part of the work has been pushed to mainline.
     SWAPIN: A page fault does iodepth=1 sync IO, but it's a bit of a waste to issue only a page-sized IO. The obvious solution is doing swap readahead.
     But the current in-kernel swap readahead is arbitrary (always 8 pages), and it doesn't always perform well for either random or sequential access workloads. Shaohua introduced a new flag for madvise (MADV_WILLNEED) to do swap prefetch, so the changes happen in the userspace API and leave the in-kernel readahead unchanged (but I think some improvement can also be done there).
     SWAP discard: As we know, discard is important for SSD write throughput, but the current swap discard implementation is synchronous. He changed it to an async discard which allows discard and write to run at the same time. Meanwhile, the unit of discard is also optimized to a cluster.
     Misc: lock contention. With many concurrent swapouts and swapins, contention on locks such as anon_vma or swap_lock is high, so he changed the swap_lock to a per-swap lock. But there is still some lock contention on very high-speed SSDs because of the swapcache address_space lock.

     Zproject (led by Bob Liu)
     Bob gave us a very nice introduction to the current memory compression status. There are now 3 projects (zswap/zram/zcache) which all aim to smooth out swap IO storms and improve performance, but they each have their own pros and cons.
     ZSWAP: It is implemented on top of the frontswap API and uses a dynamic allocator named zbud to allocate free pages. Zbud means pairs of zpages are "buddied", and it can only store at most two compressed pages in one page frame, so the maximum compression ratio is 50%. Each page frame is LRU-linked and can be shrunk under memory pressure. If the compressed memory pool reaches its limit, shrinking or reclaim happens: a page frame is decompressed into two newly allocated pages which are then written to the real swap device, but this can fail when allocating the two pages.
     ZRAM: Acts as a compressed ramdisk used as a swap device, and it uses zsmalloc as its allocator, which has high density but may have fragmentation issues. Besides, page reclaim is hard since it needs more pages to uncompress and free just one page. ZRAM is preferred by embedded systems which may not have any real swap device. Both ZRAM and ZSWAP are currently in the drivers/staging tree, and in the mm community there are some discussions about merging ZRAM into ZSWAP or vice versa, but no agreement yet.
     ZCACHE: Handles file page compression, but it was removed from staging recently.

     From industry (led by Tang Jie, LSI)
     An LSI engineer introduced several new products to us. The first was RAID 5/6 cards that use full-stripe writes to improve performance. The second one he introduced was the SandForce flash controller, which can understand data file types (data entropy) to reduce write amplification (WA) for nearly all writes. It's called DuraWrite and the typical WA is 0.5. What's more, if its Dynamic Logical Capacity function module is enabled, the controller can do data compression that is transparent to the upper layer. LSI testing shows that with this virtual capacity enabled, a 1x TB drive can support up to 2x TB of capacity, but the application must monitor free flash space to maintain optimal performance and to guard against free flash space exhaustion. He said the most useful application is for databases. Another thing worth mentioning is the NV-DRAM memory in NMR/Raptor, which is directly exposed to the host system. Applications can directly access the NV-DRAM via a memory address, using the standard system call mmap(). He said that it is very useful for database logging now.
     This kind of NVM product has begun to appear in recent years, and it is said that Samsung is building a research center in China for related products. IMHO, NVM will have an effect on the current OS layers, especially on file systems; e.g. journaling may need to be redesigned to fully utilize this nonvolatile memory.

     OCFS2 (led by Canquan Shen)
     Without a doubt, Huawei has been the biggest contributor to OCFS2 in the past two years. They have posted 46 upstream patches and 39 patches have been merged. Their current project is based on 32/64-node clusters, but they have also tried 128 nodes at the experimental stage. The major work they are doing is supporting ATS (atomic test and set); it can work with DLM at the same time. It looks like this idea is inspired by the VMware VMFS locking, i.e. http://blogs.vmware.com/vsphere/2012/05/vmfs-locking-uncovered.html

     CLK - 18th October 2013

     Improving Linux Development with Better Tools (Andi Kleen)
     This talk focused on how to find and solve bugs as Linux's complexity keeps growing. Generally, we can do this with the following kinds of tools:
     Static code checker tools, e.g. sparse, smatch, coccinelle, the clang checker, checkpatch, gcc -W/LTO, stanse. These can help check a lot of things, from simple mistakes to complex problems, but the challenges are: some are very slow, false positives, and it may take a concentrated effort to get the false positives down. In particular, no static checker I found can follow indirect calls ("OO in C", common in the kernel): struct foo_ops { int (*do_foo)(struct foo *obj); } foo->do_foo(foo);
     Dynamic runtime checkers, e.g. thread checkers, kmemcheck, lockdep. Ideally all kernel code would come with a test suite; then someone could run all the dynamic checkers.
     Fuzzers/test suites. e.g. Trinity is a great tool; it finds many bugs, but needs a manual model for each syscall. Modern fuzzers are built around automatic feedback, but not for the kernel yet: http://taviso.decsystem.org/making_software_dumber.pdf
     Debuggers/tracers to understand code, e.g. ftrace, which can dump on events/oops/custom triggers, but there is still too much overhead in many cases to run them all the time during debugging.
     Tools to read/understand source, e.g. grep/cscope work great for many cases, but do not understand indirect pointers (the OO-in-C model used in the kernel), as in "give us all do_foo instances": struct foo_ops { int (*do_foo)(struct foo *obj); } = { .do_foo = my_foo }; foo->do_foo(foo); It would be great to have a cscope-like tool that understands this based on types/initializers.

     XFS: The High Performance Enterprise File System (Jeff Liu) [slides]
     I gave a talk introducing the disk layout and unique features, as well as the recent changes. The slides include some charts comparing the performance of XFS/Btrfs/Ext4 for small files. About a dozen users raised their hands when I asked who had experience with XFS. I remember that when I asked the same question at LinuxCon Japan, only 3 people raised their hands, but they were Chris Mason, Ric Wheeler, and another attendee. The attendee questions were mainly focused on stability and comparison with other file systems.

     Linux Containers (Feng Gao)
     The speaker introduced the purpose of the various kinds of namespaces, including mount/UTS/IPC/Network/Pid/User, as well as the system API/ABI. For the userspace tools, he mainly focused on Libvirt LXC rather than LXC itself.
     Libvirt LXC is another userspace container management tool, implemented as one type of libvirt driver; it can manage containers, create namespaces, create a private filesystem layout for a container, create devices for a container, and set up resource controllers via cgroups. In this talk, Feng also mentioned two more possible new namespaces for the future. The first is audit, though it is not yet clear whether it should be assigned to the user namespace or not. The other is syslog, but the question is: do we really need it?

     In-memory Compression (Bob Liu)
     Same as at CLSF, a nice introduction that I have already mentioned above.

     Misc
     There were some other talks related to ACPI-based memory hotplug, smart wake-affinity in the scheduler, etc., but my head is not big enough to record all those things. -- Jeff Liu

    Read the article

  • Sorting Algorithms

    - by MarkPearl
    General Every time I go back to university I find myself wading through sorting algorithms and their implementation in C++. Up to now I haven’t really appreciated their true value. However as I discovered this last week with Dictionaries in C# – having a knowledge of some basic programming principles can greatly improve the performance of a system and make one think twice about how to tackle a problem. I’m going to cover briefly in this post the following: Selection Sort Insertion Sort Shellsort Quicksort Mergesort Heapsort (not complete) Selection Sort Array based selection sort is a simple approach to sorting an unsorted array. Simply put, it repeats two basic steps to achieve a sorted collection. It starts with a collection of data and repeatedly parses it, each time sorting out one element and reducing the size of the next iteration of parsed data by one. So the first iteration would go something like this… Go through the entire array of data and find the lowest value Place the value at the front of the array The second iteration would go something like this… Go through the array from position two (position one has already been sorted with the smallest value) and find the next lowest value in the array. Place the value at the second position in the array This process would be completed until the entire array had been sorted. A positive about selection sort is that it does not make many item movements. In fact, in a worst case scenario every items is only moved once. Selection sort is however a comparison intensive sort. If you had 10 items in a collection, just to parse the collection you would have 10+9+8+7+6+5+4+3+2=54 comparisons to sort regardless of how sorted the collection was to start with. If you think about it, if you applied selection sort to a collection already sorted, you would still perform relatively the same number of iterations as if it was not sorted at all. Many of the following algorithms try and reduce the number of comparisons if the list is already sorted – leaving one with a best case and worst case scenario for comparisons. Likewise different approaches have different levels of item movement. Depending on what is more expensive, one may give priority to one approach compared to another based on what is more expensive, a comparison or a item move. Insertion Sort Insertion sort tries to reduce the number of key comparisons it performs compared to selection sort by not “doing anything” if things are sorted. Assume you had an collection of numbers in the following order… 10 18 25 30 23 17 45 35 There are 8 elements in the list. If we were to start at the front of the list – 10 18 25 & 30 are already sorted. Element 5 (23) however is smaller than element 4 (30) and so needs to be repositioned. We do this by copying the value at element 5 to a temporary holder, and then begin shifting the elements before it up one. So… Element 5 would be copied to a temporary holder 10 18 25 30 23 17 45 35 – T 23 Element 4 would shift to Element 5 10 18 25 30 30 17 45 35 – T 23 Element 3 would shift to Element 4 10 18 25 25 30 17 45 35 – T 23 Element 2 (18) is smaller than the temporary holder so we put the temporary holder value into Element 3. 10 18 23 25 30 17 45 35 – T 23   We now have a sorted list up to element 6. And so we would repeat the same process by moving element 6 to a temporary value and then shifting everything up by one from element 2 to element 5. 
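     Here is a minimal Java sketch of the array-based insertion sort walked through above (Java rather than the C++/C# mentioned in the post; the class and method names are purely illustrative):

     import java.util.Arrays;

     class InsertionSortSketch {
         // Shift larger elements up one position, then drop the held value into place.
         static void insertionSort(int[] a) {
             for (int i = 1; i < a.length; i++) {
                 int temp = a[i];              // temporary holder, as in the walkthrough
                 int j = i - 1;
                 while (j >= 0 && a[j] > temp) {
                     a[j + 1] = a[j];          // shift the larger element up by one
                     j--;
                 }
                 a[j + 1] = temp;              // insert the held value into the sorted portion
             }
         }

         public static void main(String[] args) {
             int[] data = {10, 18, 25, 30, 23, 17, 45, 35};  // the collection from the walkthrough
             insertionSort(data);
             System.out.println(Arrays.toString(data));      // [10, 17, 18, 23, 25, 30, 35, 45]
         }
     }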
As you can see, one major setback for this technique is the shifting values up one – this is because up to now we have been considering the collection to be an array. If however the collection was a linked list, we would not need to shift values up, but merely remove the link from the unsorted value and “reinsert” it in a sorted position. Which would reduce the number of transactions performed on the collection. So.. Insertion sort seems to perform better than selection sort – however an implementation is slightly more complicated. This is typical with most sorting algorithms – generally, greater performance leads to greater complexity. Also, insertion sort performs better if a collection of data is already sorted. If for instance you were handed a sorted collection of size n, then only n number of comparisons would need to be performed to verify that it is sorted. It’s important to note that insertion sort (array based) performs a number item moves – every time an item is “out of place” several items before it get shifted up. Shellsort – Diminishing Increment Sort So up to now we have covered Selection Sort & Insertion Sort. Selection Sort makes many comparisons and insertion sort (with an array) has the potential of making many item movements. Shellsort is an approach that takes the normal insertion sort and tries to reduce the number of item movements. In Shellsort, elements in a collection are viewed as sub-collections of a particular size. Each sub-collection is sorted so that the elements that are far apart move closer to their final position. Suppose we had a collection of 15 elements… 10 20 15 45 36 48 7 60 18 50 2 19 43 30 55 First we may view the collection as 7 sub-collections and sort each sublist, lets say at intervals of 7 10 60 55 – 20 18 – 15 50 – 45 2 – 36 19 – 48 43 – 7 30 10 55 60 – 18 20 – 15 50 – 2 45 – 19 36 – 43 48 – 7 30 (Sorted) We then sort each sublist at a smaller inter – lets say 4 10 55 60 18 – 20 15 50 2 – 45 19 36 43 – 48 7 30 10 18 55 60 – 2 15 20 50 – 19 36 43 45 – 7 30 48 (Sorted) We then sort elements at a distance of 1 (i.e. we apply a normal insertion sort) 10 18 55 60 2 15 20 50 19 36 43 45 7 30 48 2 7 10 15 18 19 20 30 36 43 45 48 50 55 (Sorted) The important thing with shellsort is deciding on the increment sequence of each sub-collection. From what I can tell, there isn’t any definitive method and depending on the order of your elements, different increment sequences may perform better than others. There are however certain increment sequences that you may want to avoid. An even based increment sequence (e.g. 2 4 8 16 32 …) should typically be avoided because it does not allow for even elements to be compared with odd elements until the final sort phase – which in a way would negate many of the benefits of using sub-collections. The performance on the number of comparisons and item movements of Shellsort is hard to determine, however it is considered to be considerably better than the normal insertion sort. Quicksort Quicksort uses a divide and conquer approach to sort a collection of items. The collection is divided into two sub-collections – and the two sub-collections are sorted and combined into one list in such a way that the combined list is sorted. The algorithm is in general pseudo code below… Divide the collection into two sub-collections Quicksort the lower sub-collection Quicksort the upper sub-collection Combine the lower & upper sub-collection together As hinted at above, quicksort uses recursion in its implementation. 
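     A minimal Java sketch of the quicksort outline just given (again Java purely for illustration; this version simply uses the last element as the pivot, which ties into the pivot discussion that follows):

     class QuicksortSketch {
         static void quicksort(int[] a, int low, int high) {
             if (low < high) {
                 int p = partition(a, low, high);   // divide into lower and upper sub-collections
                 quicksort(a, low, p - 1);          // quicksort the lower sub-collection
                 quicksort(a, p + 1, high);         // quicksort the upper sub-collection
             }                                      // no merge step needed: the pieces are already in place
         }

         static int partition(int[] a, int low, int high) {
             int pivot = a[high];                   // last element as the pivot
             int i = low - 1;
             for (int j = low; j < high; j++) {
                 if (a[j] <= pivot) {               // move smaller elements into the lower part
                     i++;
                     int t = a[i]; a[i] = a[j]; a[j] = t;
                 }
             }
             int t = a[i + 1]; a[i + 1] = a[high]; a[high] = t;
             return i + 1;                          // final position of the pivot
         }
     }

     Calling quicksort(data, 0, data.length - 1) sorts the whole array in place.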
The real trick with quicksort is to get the lower and upper sub-collections to be of equal size. The size of a sub-collection is determined by what value the pivot is. Once a pivot is determined, one would partition to sub-collections and then repeat the process on each sub collection until you reach the base case. With quicksort, the work is done when dividing the sub-collections into lower & upper collections. The actual combining of the lower & upper sub-collections at the end is relatively simple since every element in the lower sub-collection is smaller than the smallest element in the upper sub-collection. Mergesort With quicksort, the average-case complexity was O(nlog2n) however the worst case complexity was still O(N*N). Mergesort improves on quicksort by always having a complexity of O(nlog2n) regardless of the best or worst case. So how does it do this? Mergesort makes use of the divide and conquer approach to partition a collection into two sub-collections. It then sorts each sub-collection and combines the sorted sub-collections into one sorted collection. The general algorithm for mergesort is as follows… Divide the collection into two sub-collections Mergesort the first sub-collection Mergesort the second sub-collection Merge the first sub-collection and the second sub-collection As you can see.. it still pretty much looks like quicksort – so lets see where it differs… Firstly, mergesort differs from quicksort in how it partitions the sub-collections. Instead of having a pivot – merge sort partitions each sub-collection based on size so that the first and second sub-collection of relatively the same size. This dividing keeps getting repeated until the sub-collections are the size of a single element. If a sub-collection is one element in size – it is now sorted! So the trick is how do we put all these sub-collections together so that they maintain their sorted order. Sorted sub-collections are merged into a sorted collection by comparing the elements of the sub-collection and then adjusting the sorted collection. Lets have a look at a few examples… Assume 2 sub-collections with 1 element each 10 & 20 Compare the first element of the first sub-collection with the first element of the second sub-collection. Take the smallest of the two and place it as the first element in the sorted collection. In this scenario 10 is smaller than 20 so 10 is taken from sub-collection 1 leaving that sub-collection empty, which means by default the next smallest element is in sub-collection 2 (20). So the sorted collection would be 10 20 Lets assume 2 sub-collections with 2 elements each 10 20 & 15 19 So… again we would Compare 10 with 15 – 10 is the winner so we add it to our sorted collection (10) leaving us with 20 & 15 19 Compare 20 with 15 – 15 is the winner so we add it to our sorted collection (10 15) leaving us with 20 & 19 Compare 20 with 19 – 19 is the winner so we add it to our sorted collection (10 15 19) leaving us with 20 & _ 20 is by default the winner so our sorted collection is 10 15 19 20. Make sense? Heapsort (still needs to be completed) So by now I am tired of sorting algorithms and trying to remember why they were so important. I think every year I go through this stuff I wonder to myself why are we made to learn about selection sort and insertion sort if they are so bad – why didn’t we just skip to Mergesort & Quicksort. 
     I guess the only explanation I have for this is that sometimes you learn things so that you can implement them in future – and other times you learn things so that you know it isn’t the best way of implementing things and that you don’t need to implement it in future. Anyhow… luckily this is going to be the last one of my sorts for today. The first step in heapsort is to convert a collection of data into a heap. After the data is converted into a heap, sorting begins… So what is the definition of a heap? If we have to convert a collection of data into a heap, how do we know when it is a heap and when it is not? The definition of a heap is as follows: A heap is a list in which each element contains a key, such that the key in the element at position k in the list is at least as large as the key in the element at position 2k + 1 (if it exists) and 2k + 2 (if it exists), where positions are counted from 0. Does that make sense? At first glance I’m thinking what the heck??? But then after re-reading my notes I see that we are doing something different – up to now we have really looked at data as an array or sequential collection of data that we need to sort – a heap represents data in a slightly different way – although the data is stored in a sequential collection, for a sequential collection of data to be a valid heap it only needs to be “semi sorted”. Let me try and explain a bit further with an example… Example 1 of Potential Heap Data: Assume we had a collection of numbers as follows 1[0] 2[1] 3[2] 4[3] 5[4] 6[5]. For this to be a valid heap, the element with value 1 at position [0] needs to be greater than or equal to the elements at position [1] (2k + 1) and position [2] (2k + 2). Is 1 >= 2 and 1 >= 3? No, so in the above example the collection of numbers is not a valid heap. Example 2 of Potential Heap Data: Let’s look at another collection of numbers as follows 6[0] 5[1] 4[2] 3[3] 2[4] 1[5]. Is this a valid heap? Well… the element with the value 6 at position [0] must be greater than or equal to the elements at position [1] and position [2]. Is 6 >= 5 and 6 >= 4? Yes it is. Let’s look at the element 5 at position [1]. It must be greater than or equal to the values at [3] & [4]. Is 5 >= 3 and 5 >= 2? Yes it is. If you continued to examine this second collection of data you would find that it is a valid heap based on the definition above.
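     To tie the definition down, here is a small Java sketch that checks whether an array is a valid heap under the rule above (0-based positions, children at 2k + 1 and 2k + 2; the class and method names are just for illustration):

     class HeapCheckSketch {
         // Returns true if a[k] >= a[2k+1] and a[k] >= a[2k+2] for every position k where those children exist.
         static boolean isHeap(int[] a) {
             for (int k = 0; k < a.length; k++) {
                 int left = 2 * k + 1, right = 2 * k + 2;
                 if (left < a.length && a[k] < a[left]) return false;
                 if (right < a.length && a[k] < a[right]) return false;
             }
             return true;
         }

         public static void main(String[] args) {
             System.out.println(isHeap(new int[] {1, 2, 3, 4, 5, 6}));  // false - example 1 above
             System.out.println(isHeap(new int[] {6, 5, 4, 3, 2, 1}));  // true  - example 2 above
         }
     }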

    Read the article

  • Problems with Castle DynamicProxy2 on .Net 3.5 SP1 on Win2003 Server

    - by Andrea Balducci
     I have an ASP.NET MVC + NHibernate application. On my dev machine (Win 7 Enterprise) everything works fine; when deployed on Win 2k3 (I tried 2 different VMs and one physical machine) I get the following error. Can anyone help? I cannot explain this issue (I tried the same build, so I think it's a machine configuration issue).
     Derived method 'set_ID' in type 'CustomerProxy75950979a2a048e889584c21696f7f1b' from assembly 'DynamicProxyGenAssembly2, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' cannot reduce access
     [TypeLoadException: Derived method 'set_ID' in type 'CustomerProxy75950979a2a048e889584c21696f7f1b' from assembly 'DynamicProxyGenAssembly2, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' cannot reduce access.]
     System.Reflection.Emit.TypeBuilder._TermCreateClass(Int32 handle, Module module) +0
     System.Reflection.Emit.TypeBuilder.CreateTypeNoLock() +915
     System.Reflection.Emit.TypeBuilder.CreateType() +108
     Castle.DynamicProxy.Generators.Emitters.AbstractTypeEmitter.BuildType() +48
     Castle.DynamicProxy.Generators.ClassProxyGenerator.GenerateCode(Type[] interfaces, ProxyGenerationOptions options) +3821
     Castle.DynamicProxy.DefaultProxyBuilder.CreateClassProxy(Type classToProxy, Type[] additionalInterfacesToProxy, ProxyGenerationOptions options) +84
     Castle.DynamicProxy.ProxyGenerator.CreateClassProxy(Type classToProxy, Type[] additionalInterfacesToProxy, ProxyGenerationOptions options, Object[] constructorArguments, IInterceptor[] interceptors) +92
     Castle.DynamicProxy.ProxyGenerator.CreateClassProxy(Type classToProxy, Type[] additionalInterfacesToProxy, IInterceptor[] interceptors) +21
     NHibernate.ByteCode.Castle.ProxyFactory.GetProxy(Object id, ISessionImplementor session) +283

    Read the article

  • Why am I getting errors in this switch statement written in C?

    - by mekasperasky
     I have a character array b which stores different identifiers in different iterations. I have to compare b with various keywords of the C programming language and print the result into a file. When I do it using the following switch statement it gives me errors:
     b[i]='\0';
     switch(b[i])
     {
     case "if":fprintf(fp2,"if ----> IDENTIFIER \n");
     case "then":fprintf(fp2,"then ----> IDENTIFIER \n");
     case "else":fprintf(fp2,"else ----> IDENTIFIER \n");
     case "switch":fprintf(fp2,"switch ----> IDENTIFIER \n");
     case 'printf':fprintf(fp2,"prtintf ----> IDENTIFIER \n");
     case 'scanf':fprintf(fp2,"else ----> IDENTIFIER \n");
     case 'NULL':fprintf(fp2,"NULL ----> IDENTIFIER \n");
     case 'int':fprintf(fp2,"INT ----> IDENTIFIER \n");
     case 'char':fprintf(fp2,"char ----> IDENTIFIER \n");
     case 'float':fprintf(fp2,"float ----> IDENTIFIER \n");
     case 'long':fprintf(fp2,"long ----> IDENTIFIER \n");
     case 'double':fprintf(fp2,"double ----> IDENTIFIER \n");
     case 'char':fprintf(fp2,"char ----> IDENTIFIER \n");
     case 'const':fprintf(fp2,"const ----> IDENTIFIER \n");
     case 'continue':fprintf(fp2,"continue ----> IDENTIFIER \n");
     case 'break':fprintf(fp2,"long ----> IDENTIFIER \n");
     case 'for':fprintf(fp2,"long ----> IDENTIFIER \n");
     case 'size of':fprintf(fp2,"size of ----> IDENTIFIER \n");
     case 'register':fprintf(fp2,"register ----> IDENTIFIER \n");
     case 'short':fprintf(fp2,"short ----> IDENTIFIER \n");
     case 'auto':fprintf(fp2,"auto ----> IDENTIFIER \n");
     case 'while':fprintf(fp2,"while ----> IDENTIFIER \n");
     case 'do':fprintf(fp2,"do ----> IDENTIFIER \n");
     case 'case':fprintf(fp2,"case ----> IDENTIFIER \n");
     }
     The errors are:
     lex.c:94:13: warning: character constant too long for its type lex.c:95:13: warning: character constant too long for its type lex.c:96:13: warning: multi-character character constant lex.c:97:13: warning: multi-character character constant lex.c:98:13: warning: multi-character character constant lex.c:99:13: warning: character constant too long for its type lex.c:100:13: warning: multi-character character constant lex.c:101:13: warning: character constant too long for its type lex.c:102:13: warning: multi-character character constant lex.c:103:13: warning: character constant too long for its type lex.c:104:13: warning: character constant too long for its type lex.c:105:13: warning: character constant too long for its type lex.c:106:13: warning: multi-character character constant lex.c:107:13: warning: character constant too long for its type lex.c:108:13: warning: character constant too long for its type lex.c:109:13: warning: character constant too long for its type lex.c:110:12: warning: multi-character character constant lex.c:111:13: warning: character constant too long for its type lex.c:112:13: warning: multi-character character constant lex.c:113:13: warning: multi-character character constant lex.c: In function ‘int main()’: lex.c:90: error: case label does not reduce to an integer constant lex.c:91: error: case label does not reduce to an integer constant lex.c:92: error: case label does not reduce to an integer constant lex.c:93: error: case label does not reduce to an integer constant lex.c:94: warning: overflow in implicit constant conversion lex.c:95: warning: overflow in implicit constant conversion lex.c:95: error: duplicate case value lex.c:94: error: previously used here lex.c:96: warning: overflow in implicit constant conversion lex.c:97: warning: overflow in implicit constant conversion lex.c:98: warning: overflow in implicit constant conversion lex.c:99: warning: overflow in implicit 
constant conversion lex.c:99: error: duplicate case value lex.c:97: error: previously used here lex.c:100: warning: overflow in implicit constant conversion lex.c:101: warning: overflow in implicit constant conversion lex.c:102: warning: overflow in implicit constant conversion lex.c:102: error: duplicate case value lex.c:98: error: previously used here lex.c:103: warning: overflow in implicit constant conversion lex.c:103: error: duplicate case value lex.c:97: error: previously used here lex.c:104: warning: overflow in implicit constant conversion lex.c:104: error: duplicate case value lex.c:101: error: previously used here lex.c:105: warning: overflow in implicit constant conversion lex.c:106: warning: overflow in implicit constant conversion lex.c:106: error: duplicate case value lex.c:98: error: previously used here lex.c:107: warning: overflow in implicit constant conversion lex.c:107: error: duplicate case value lex.c:94: error: previously used here lex.c:108: warning: overflow in implicit constant conversion lex.c:108: error: duplicate case value lex.c:98: error: previously used here lex.c:109: warning: overflow in implicit constant conversion lex.c:109: error: duplicate case value lex.c:97: error: previously used here lex.c:110: warning: overflow in implicit constant conversion lex.c:111: warning: overflow in implicit constant conversion lex.c:111: error: duplicate case value lex.c:101: error: previously used here lex.c:112: warning: overflow in implicit constant conversion lex.c:112: error: duplicate case value lex.c:110: error: previously used here lex.c:113: warning: overflow in implicit constant conversion lex.c:113: error: duplicate case value lex.c:101: error: previously used here

    Read the article

  • I thought this parsing would be simple...

    - by Rebol Tutorial
     ... and I'm hitting a wall. I don't understand why this doesn't work (I need to be able to parse either the single-tag version, terminated with "/>", or the two-tag version, terminated with a closing </pre:myTag>):
     Rebol[]
     content: {<pre:myTag attr1="helloworld" attr2="hello"/>
     <pre:myTag attr1="helloworld" attr2="hello">
     </pre:myTag>
     <pre:myTag attr3="helloworld" attr4="hello"/>
     }
     spacer: charset reduce [#" " newline]
     letter: charset reduce ["ABCDEFGHIJKLMNOPQRSTUabcdefghijklmnopqrstuvwxyz1234567890="]
     rule: [
         any [
             {<pre:myTag}
             any [any letter {"} any letter {"}]
             mark: (print {clipboard... after any letter {"} any letter {"}} write clipboard:// mark input)
             any spacer
             mark: (print "clipboard..." write clipboard:// mark input)
             ["/>" | ">" any spacer </pre:myTag> ]
             any spacer
             (insert mark { Visible="false"})
         ]
         to end
     ]
     parse content rule
     write clipboard:// content
     print "The end"
     input

    Read the article

  • RSA encryption results in server execution timeout

    - by Nilambari
     Hi, I am using PHP Crypt_RSA (http://pear.php.net/package/Crypt_RSA) for encrypting and decrypting content. The content is 1 KB in size. These are the results:
     keylength = 1024: the encryption function takes 225 secs
     keylength = 2048: the encryption function takes 115 secs
     I need to reduce this execution time, as most live Apache servers have a 120 sec execution time limit. How can I reduce it? The RSA algorithm docs say that only 1024 - 2048 bit keys are generated. I actually tried to generate a larger key, but it always results in an execution timeout. How do I go about reducing the encryption/decryption execution time? Thanks, Nila

    Read the article

  • As our favorite imperative languages gain functional constructs, should loops be considered a code smell?

    - by Michael Buen
     In allusion to Dare Obasanjo's impressions on Map, Reduce, Filter (Functional Programming in C# 3.0: How Map/Reduce/Filter can Rock your World): "With these three building blocks, you could replace the majority of the procedural for loops in your application with a single line of code. C# 3.0 doesn't just stop there." Should we increasingly use them instead of loops? And should having loops (instead of those three building blocks of data manipulation) be one of the metrics for coding horrors in code reviews? And why? [NOTE] I'm not advocating fully functional programming for code that could simply be translated to loops (e.g. tail recursions). Asking for a politer term: considering that the phrase "code smell" is not so diplomatic, I posted another question, http://stackoverflow.com/questions/432492/whats-the-politer-word-for-code-smell, about the right word for "code smell", er.. utterly bad code. Should that phrase have a place in our programming parlance?
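     The question is framed around C# 3.0, but the same trade-off can be sketched in Java streams (a hedged illustration, not code from the referenced article): an explicit accumulator loop next to the equivalent filter/map/reduce pipeline.

     import java.util.List;

     class PipelineSketch {
         public static void main(String[] args) {
             List<Integer> orders = List.of(120, 45, 300, 80, 210);

             // Explicit loop version: accumulate the doubled total of large orders.
             int totalLoop = 0;
             for (int amount : orders) {
                 if (amount > 100) {
                     totalLoop += amount * 2;
                 }
             }

             // Filter / map / reduce version of the same computation.
             int totalPipeline = orders.stream()
                     .filter(amount -> amount > 100)   // keep large orders
                     .map(amount -> amount * 2)        // transform each one
                     .reduce(0, Integer::sum);         // fold into a single value

             System.out.println(totalLoop + " " + totalPipeline);  // 1260 1260
         }
     }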

    Read the article

  • Assert parameters in a table-valued UDF

    - by Clay Lenhart
     Is there a way to create "asserts" on the parameters of a table-valued UDF? I'd like to use a table-valued UDF for performance reasons; however, I know that certain parameter combinations (like start and end dates that are more than a month apart) will cause performance issues on the server for all users. End users query the database via Excel using UDFs. UDFs (and table-valued UDFs in particular) are useful when the data is too large for Excel. Users write simple SQL queries that categorize the data into groups to reduce the number of rows. For example, a user may be interested in weekly aggregates rather than hourly ones, and writes a GROUP BY SELECT statement to reduce the rows by a factor of 24x7=168. I know I can write RAISERROR statements in multistatement UDFs, but table-valued UDFs are integrated into the query optimizer, so these queries are more efficient with table-valued UDFs. So, can I define assertions on the parameters passed to a table-valued UDF?

    Read the article

  • Apply limit in mapreduce function in php?

    - by Rohan Kumar
     How do I apply a limit in PHP/MongoDB when using the mapReduce function? I tried this:
     $cmd = array( // condition array
         "mapreduce" => "user",
         "map" => $map,
         "reduce" => $reduce,
         "out" => array("inline" => 1),
         "limit" => 2
     );
     $db = connect();
     $query = $db->command($cmd); // run command
     But it's not working the way I need: it gives 2 documents, and I can't apply the limit to sub-documents. If I have hundreds of sub-documents and I want paging over the sub-documents, then it fails. Is it possible to apply a limit on sub-documents?

    Read the article

  • How to show all methods and data when the object has no "__iter__" function in Python

    - by zjm1126
     I found a way:
     (1) The dir(object) output is:
     a="['__class__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__doc__', '__getattribute__', '__getitem__', '__hash__', '__init__', '__iter__', '__metaclass__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__str__', '__weakref__', '_errors', '_fields', '_prefix', '_unbound_fields', 'confirm', 'data', 'email', 'errors', 'password', 'populate_obj', 'process', 'username', 'validate']"
     (2) b=eval(a)
     (3) And it becomes a list of all the attribute names:
     ['__class__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__doc__', '__getattribute__', '__getitem__', '__hash__', '__init__', '__iter__', '__metaclass__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__str__', '__weakref__', '_errors', '_fields', '_prefix', '_unbound_fields', 'confirm', 'data', 'email', 'errors', 'password', 'populate_obj', 'process', 'username', 'validate']
     (4) Then I try to show the object's methods and values, and the full code is:
     s=''
     a=eval(str(dir(object)))
     for i in a:
         s+=str(i)+':'+str(object[i])
     print s
     But it shows this error: KeyError: '__class__'. So how do I make my code run? Thanks.

    Read the article

  • Changing coding style due to Android GC performance: how far is too far?

    - by Benju
    I keep hearing that Android applications should try to limit the number of objects created in order to reduce the workload on the garbage collector. It makes sense that you may not want to create massive numbers of objects to track on a limited memory footprint; for example, on a traditional server application, creating 100,000 objects within a few seconds would not be unheard of. The problem is: how far should I take this? I've seen tons of examples of Android applications relying on static state in order to (supposedly) "speed things up". Does increasing the number of instances that need to be garbage collected from dozens to hundreds really make that big a difference? I can imagine changing my coding style so as not to create hundreds of thousands of objects like you might on a full-blown Java EE server, but relying on a bunch of static state to (supposedly) reduce the number of objects to be garbage collected seems odd. How much is it really necessary to change your coding style in order to create performant Android apps?

    Read the article

  • Reducer getting fewer records than expected

    - by sathishs
    We have a scenario where we need to generate a unique key for every single row in a file. We have a timestamp column, but in a few cases there are multiple rows for the same timestamp. We decided the unique value would be the timestamp appended with its respective count, as in the program below. The mapper just emits the timestamp as the key and the entire row as its value, and the key is generated in the reducer. The problem is that the map outputs about 236 rows, of which only 230 records are fed as input to the reducer, which outputs the same 230 records.

        public class UniqueKeyGenerator extends Configured implements Tool {

            private static final String SEPERATOR = "\t";
            private static final int TIME_INDEX = 10;
            private static final String COUNT_FORMAT_DIGITS = "%010d";

            public static class Map extends Mapper<LongWritable, Text, Text, Text> {
                @Override
                protected void map(LongWritable key, Text row, Context context)
                        throws IOException, InterruptedException {
                    String input = row.toString();
                    String[] vals = input.split(SEPERATOR);
                    if (vals != null && vals.length >= TIME_INDEX) {
                        context.write(new Text(vals[TIME_INDEX - 1]), row);
                    }
                }
            }

            public static class Reduce extends Reducer<Text, Text, NullWritable, Text> {
                @Override
                protected void reduce(Text eventTimeKey, Iterable<Text> timeGroupedRows, Context context)
                        throws IOException, InterruptedException {
                    int cnt = 1;
                    final String eventTime = eventTimeKey.toString();
                    for (Text val : timeGroupedRows) {
                        final String res = SEPERATOR.concat(getDate(
                                Long.valueOf(eventTime)).concat(
                                String.format(COUNT_FORMAT_DIGITS, cnt)));
                        val.append(res.getBytes(), 0, res.length());
                        cnt++;
                        context.write(NullWritable.get(), val);
                    }
                }
            }

            public static String getDate(long time) {
                SimpleDateFormat utcSdf = new SimpleDateFormat("yyyyMMddhhmmss");
                utcSdf.setTimeZone(TimeZone.getTimeZone("America/Los_Angeles"));
                return utcSdf.format(new Date(time));
            }

            public int run(String[] args) throws Exception {
                conf(args);
                return 0;
            }

            public static void main(String[] args) throws Exception {
                conf(args);
            }

            private static void conf(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
                Configuration conf = new Configuration();
                Job job = new Job(conf, "uniquekeygen");
                job.setJarByClass(UniqueKeyGenerator.class);
                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(Text.class);
                job.setMapperClass(Map.class);
                job.setReducerClass(Reduce.class);
                job.setInputFormatClass(TextInputFormat.class);
                job.setOutputFormatClass(TextOutputFormat.class);
                // job.setNumReduceTasks(400);
                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));
                job.waitForCompletion(true);
            }
        }

    The behaviour is consistent for larger inputs, and the difference is as big as 208969 records for an input of 20855982 lines. What might be the reason for the reducer receiving fewer inputs than the mapper emits?

    Read the article

  • Problem with Clojure function

    - by Bozhidar Batsov
    Hi everyone, I started working on Project Euler in Clojure yesterday, and I have a problem with one of my solutions that I cannot figure out. I have this function:

        (defn find-max-palindrom-in-range [beg end]
          (reduce max
                  (loop [n beg
                         result []]
                    (if (>= n end)
                      result
                      (recur (inc n)
                             (concat result
                                     (filter #(is-palindrom? %)
                                             (map #(* n %) (range beg end)))))))))

    I try to run it like this: (find-max-palindrom-in-range 100 1000) and I get this exception: java.lang.Integer cannot be cast to clojure.lang.IFn [Thrown class java.lang.ClassCastException], which I presume means that somewhere I'm trying to evaluate an Integer as a function. I cannot find that place, however, and what puzzles me more is that everything works if I simply evaluate it like this:

        (reduce max
                (loop [n 100
                       result []]
                  (if (>= n 1000)
                    result
                    (recur (inc n)
                           (concat result
                                   (filter #(is-palindrom? %)
                                           (map #(* n %) (range 100 1000))))))))

    (I've just stripped down the function definition and replaced the parameters with constants.) Thanks in advance for your help, and sorry if I'm bothering you with an idiotic mistake on my part. By the way, I'm using Clojure 1.1 and the newest SLIME from ELPA.

    Read the article

  • Remove adjacent identical elements in a Ruby Array?

    - by Mike Woodhouse
    Ruby 1.8.6. I have an array containing numerical values. I want to reduce it such that sequences of the same value are reduced to a single instance of that value. So I want a = [1, 1, 1, 2, 2, 3, 3, 3, 3, 2, 2, 2, 3, 3, 3] to reduce to [1, 2, 3, 2, 3]. As you can see, Array#uniq won't work in this case. I have the following, which works: (a.size - 1).downto(1) { |i| a[i] = nil if a[i - 1] == a[i] } Can anyone come up with something less ugly?
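    Not a Ruby 1.8.6 answer, but as a sketch of the underlying idea: collapsing runs of adjacent equal values is exactly what a streaming group-by does. In Python, for instance:

        from itertools import groupby

        a = [1, 1, 1, 2, 2, 3, 3, 3, 3, 2, 2, 2, 3, 3, 3]
        collapsed = [value for value, _run in groupby(a)]   # groupby starts a new group at each change of value
        print(collapsed)   # -> [1, 2, 3, 2, 3]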

    Read the article

  • XSL template structure for choosing latest event.

    - by Deborah Klenke
    I have a list grouped by Region, and it currently shows all the items for each city. I want to reduce it to only the most recent advisory for each city. I have tried to use an xsl:for-each statement, but I am messing up the names/parameters. The list is called mlc and contains the fields Title, City, Region, Advisory, DateCreated and TT (a calculated number field holding the number of minutes from DateCreated to the end of today; I intended to use the smallest value to find the most recent advisory).

    Read the article

  • generate an array from an array with conditions

    - by Aman
    Suppose I have an array $x = (31,12,13,25,18,10); I want to reduce this array in such a way that the value of each array element becomes 32, so after the process the array will become $newx = (32,32,32,13); I have to generate this array in such a way that the sum accumulated in each element is never greater than 32. So, to create the first value, I take 1 from the second element (12): the second value becomes 11 and the first value becomes 31+1 = 32. This process should continue so that each array value becomes equal to 32, with the leftover ending up in the final element, as in $newx above.
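    Since the carrying described above preserves the total and only ever fills each element up to 32 from its right-hand neighbours, the whole process collapses to a single division. A rough sketch in Python (the question's array is PHP, but the arithmetic is the same):

        def pack_to_limit(values, limit=32):
            # As many full `limit`-sized elements as fit, plus the remainder.
            total = sum(values)
            full, remainder = divmod(total, limit)
            return [limit] * full + ([remainder] if remainder else [])

        print(pack_to_limit([31, 12, 13, 25, 18, 10]))   # -> [32, 32, 32, 13]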

    Read the article

  • Postfix and right-associative operators in LR(0) parsers

    - by Ian
    Is it possible to construct an LR(0) parser that could parse a language with both prefix and postfix operators? For example, if I had a grammar with the + (addition) and ! (factorial) operators with the usual precedence, then 1+3! should be 1 + 3! = 1 + 6 = 7; but surely, if the parser were LR(0), then when it had 1+3 on the stack it would reduce rather than shift? Also, do right-associative operators pose a problem? For example, 2^3^4 should be 2^(3^4), but again, when the parser has 2^3 on the stack, how would it know whether to reduce or shift? If this isn't possible, is there still a way to use an LR(0) parser, possibly by converting the input into Polish or Reverse Polish notation or adding brackets in the appropriate places? Would this be done before, during or after the lexing stage?

    Read the article

  • Many Associations Leading to Slow Query

    - by Joey Cadle
    I currently have a database with a lot of many-to-many associations. I have services, which have many variations, which have many staff who can perform the variation, who in turn have details about themselves like name, role, etc. With 10 services, 3 variations each, and up to 4 out of 20 staff attached to each service, even something as simple as getting all variations and the staff associated with them takes 4 s. Is there a way I can speed up these queries? I've cut down the number of queries by doing eager loading in my DBM to reduce the problems that arise from N+1 issues, but 4 s is still a long query for just a testing stage. Is there a structure out there that would make such nested many-to-many associations much quicker to select? Maybe combining everything past the service level into a single table with a 'TYPE' column? I'm just not knowledgeable enough to know the solution that turns this 4 s query into a 300 ms query... Any suggestions would be helpful.

    Read the article

  • C# - periodic data reading and Thread.Sleep()

    - by CaldonCZE
    Hello, my C# application reads data from a special USB device. The data are read as so-called "messages", each of them 24 bytes long. The number of messages that must be read per second may vary (the maximum frequency is quite high, about 700 messages per second), but the application must read them all. The only way to read the messages is by calling the function "ReadMessage", which returns one message read from the device. The function comes from an external DLL and I cannot modify it. My solution: I have a separate thread that runs for the whole lifetime of the program and whose only job is to read the messages in a cycle. The received messages are then processed in the main application thread. The function executed in the "reading thread" is the following:

        private void ReadingThreadFunction()
        {
            int cycleCount;
            try
            {
                while (this.keepReceivingMessages)
                {
                    cycleCount++;
                    TRxMsg receivedMessage;
                    ReadMessage(devHandle, out receivedMessage);
                    // ...do something with the message...
                }
            }
            catch
            {
                // ...catch exception if reading failed...
            }
        }

    This solution works fine and all messages are correctly received. However, the application consumes too many resources; the CPU of my computer runs at more than 80%, and I'd like to reduce that. Thanks to the "cycleCount" variable I know that the "cycling speed" of the thread is about 40,000 cycles per second. This is unnecessarily fast, since I need to receive at most 700 messages/sec (and the device has a buffer for about 100 messages, so the cycle speed can even be a little lower). I tried to reduce the cycle speed by suspending the thread for 1 ms with Thread.Sleep(1);. Of course this didn't work: the cycle speed dropped to about 70 cycles/second, which was not enough to read all the messages. I know the attempt was silly and that putting the thread to sleep and then waking it up takes much longer than 1 ms. However, I don't know what else to do: is there some other way to slow the thread down (to reduce CPU consumption) other than Thread.Sleep? Or am I completely wrong, and should I use something different for this task instead of a Thread, maybe Threading.Timer or ThreadPool? Thanks a lot in advance for all suggestions. This is my first question here and I'm a beginner at using threads, so please excuse me if it's not clear enough.
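    One common shape for this kind of loop, sketched here in Python rather than C# and with hypothetical read/process callbacks, is to sleep for tens of milliseconds and then drain everything the device has buffered, instead of pacing the loop per message. With a 100-message buffer and at most 700 messages per second, waking every 50 ms is still comfortably fast enough:

        import time

        def reading_loop(read_message, process, keep_running, poll_interval=0.05):
            # Drain-then-sleep polling: each wake-up empties the device buffer, so the
            # sleep only has to be shorter than the time the buffer takes to fill
            # (roughly 140 ms at 700 msg/s with a 100-message buffer).
            while keep_running():
                msg = read_message()        # hypothetical wrapper; returns None when the buffer is empty
                while msg is not None:
                    process(msg)
                    msg = read_message()
                time.sleep(poll_interval)

    Whether the real ReadMessage call can report "no message available" without blocking is the detail that decides if this shape applies; a blocking read, or an event/callback from the driver if one exists, avoids polling altogether.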

    Read the article

  • Who owes who money optimisation problem

    - by Francis
    Say you have n people who owe each other money. In general it should be possible to reduce the number of transactions that need to take place; i.e. if X owes Y £4 and Y owes X £8, then Y only needs to pay X £4 (1 transaction instead of 2). This becomes harder when X owes Y, but Y owes Z, who owes X as well. I can see that you can easily cancel out one particular cycle. It helps me to think of it as a fully connected graph, with the edges weighted by the amounts people owe each other. The problem seems to be NP-complete, but what kind of optimisation algorithm could I write, nevertheless, to reduce the total number of transactions? It doesn't have to be that efficient, as n is quite small for me.
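    A rough greedy sketch of the usual simplification: net out each person's balance, then repeatedly match the biggest debtor with the biggest creditor. It is not guaranteed to minimise the number of transactions (consistent with the NP-completeness suspicion above), but it never needs more than n-1 transfers:

        def settle(debts):
            # debts: list of (debtor, creditor, amount); returns a reduced list of transfers.
            balance = {}
            for debtor, creditor, amount in debts:
                balance[debtor] = balance.get(debtor, 0) - amount
                balance[creditor] = balance.get(creditor, 0) + amount

            owes = sorted((amt, p) for p, amt in balance.items() if amt < 0)                  # most indebted first
            owed = sorted(((amt, p) for p, amt in balance.items() if amt > 0), reverse=True)  # most owed first

            transfers = []
            while owes and owed:
                debt, debtor = owes.pop(0)
                credit, creditor = owed.pop(0)
                paid = min(-debt, credit)
                transfers.append((debtor, creditor, paid))
                if -debt > paid:
                    owes.insert(0, (debt + paid, debtor))       # debtor still owes the remainder
                elif credit > paid:
                    owed.insert(0, (credit - paid, creditor))   # creditor is still owed the remainder
            return transfers

        print(settle([("X", "Y", 4), ("Y", "X", 8)]))   # -> [('Y', 'X', 4)], matching the example above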

    Read the article

  • How to understand the functional programming code for converting IP string to a number?

    - by zfz
    In a Python discussion I saw a way to convert an IP string to an integer in a functional programming style. Here is the link. The function is implemented in a single line:

        def ipnumber(ip):
            return reduce(lambda sum, chunk: sum << 8 | chunk, map(int, ip.split(".")))

    However, I know very little about functional programming. Could anybody explain the function in detail? I have some knowledge of "map" and "reduce", but I don't know what "|" and "chunk" mean here. Thanks.
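    In case it helps, what the one-liner does is this: reduce feeds each octet (the "chunk") to the lambda in turn, and sum << 8 | chunk shifts the running value left by 8 bits and ORs the next octet into the freed low byte ("|" is bitwise OR). Written as an explicit loop in Python, with a sample address purely for illustration:

        def ipnumber_loop(ip):
            result = 0
            for chunk in map(int, ip.split(".")):   # each octet of the address, as an int
                result = result << 8 | chunk        # make room for 8 bits, then OR the octet in
            return result

        print(ipnumber_loop("192.168.0.1"))   # -> 3232235521, i.e. 0xC0A80001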

    Read the article

  • Knowledge mining using Hadoop.

    - by Anurag
    Hello there, I want to do a project on Hadoop and MapReduce and present it as my graduation project. I've given this some thought, searched the internet, and came up with the idea of implementing some basic knowledge-mining algorithms, say on social websites like Facebook, or maybe Stack Overflow, Quora, etc., and drawing some statistical graphs, comparisons, frequency distributions and other kinds of useful values. For the searching part, would it be wise to use Apache Solr? I want to know if such a thing is feasible using the tools mentioned above, and if so, how should I build on this little idea? Where can I learn about knowledge-mining algorithms that are easy to implement using Java and MapReduce techniques? In case this is a bad idea, please suggest what else could be done using Hadoop and its related sub-projects. Thank you.
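    The usual starting point for the "frequency distribution" part is a plain counting job. A minimal sketch using the mrjob library (one way to prototype Hadoop streaming jobs in Python; the class name and regex here are illustrative, not part of any of the projects mentioned above):

        import re
        from mrjob.job import MRJob

        WORD_RE = re.compile(r"[\w']+")

        class MRTermFrequency(MRJob):
            """Counts how often each term appears across the input lines."""

            def mapper(self, _, line):
                for word in WORD_RE.findall(line.lower()):
                    yield word, 1

            def reducer(self, word, counts):
                yield word, sum(counts)

        if __name__ == "__main__":
            MRTermFrequency.run()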

    Read the article

< Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25  | Next Page >