Search Results

Search found 2628 results on 106 pages for 'lexical analysis'.

The Data Scientist

- by BuckWoody

A new term - well, perhaps not that new - has come up and I’m actually very excited about it. The term is Data Scientist, and since it’s new, it’s fairly undefined. I’ll explain what I think it means, and why I’m excited about it. In general, I’ve found the term deals at its most basic with analyzing data. Of course, we all do that, and the term itself in that definition is redundant. There is no science that I know of that does not work with analyzing lots of data. But the term seems to refer to more than the common practices of looking at data visually, putting it in a spreadsheet or report, or even using simple coding to examine data sets. A Data Scientist (as far as I can make out this early in the term’s use) is someone who has a strong understanding of data sources, relevance (statistical and otherwise) and processing methods as well as front-end displays of large sets of complicated data. Some - but not all - Business Intelligence professionals have these skills. In other cases, senior developers, database architects or others fill these needs, but in my experience, many lack the strong mathematical skills needed to make these choices properly. I’ve divided the knowledge base for someone who would wear this title into three large segments. It remains to be seen if a given Data Scientist would be responsible for knowing all these areas or would specialize. There are pretty high requirements on the math side, specifically in graduate-degree level statistics, but in my experience a company will only have a few of these folks, so they are expected to know quite a bit in each of these areas. Persistence The first area is finding, cleaning and storing the data. In some cases, no cleaning is done prior to storage - it’s just identified and the cleansing is done in a later step.
This area is where the professional would be able to tell if a particular data set should be stored in a Relational Database Management System (RDBMS), across a set of key/value pair storage (NoSQL) or in a file system like HDFS (part of the Hadoop landscape) or other methods. Or do you examine the stream of data without storing it in another system at all? This is an important decision - it’s a foundation choice that deals not only with a lot of expense of purchasing systems or even using Cloud Computing (PaaS, SaaS or IaaS) to source it, but also the skillsets and other resources needed to care and feed the system for a long time. The Data Scientist sets something into motion that will probably outlast his or her career at a company or organization. Often these choices are made by senior developers, database administrators or architects in a company. But sometimes each of these has a certain bias towards making a decision one way or another. The Data Scientist would examine these choices in light of the data itself, starting perhaps even before the business requirements are created. The business may not even be aware of all the strategic and tactical data sources that they have access to. Processing Once the decision is made to store the data, the next set of decisions are based around how to process the data. An RDBMS scales well to a certain level, and provides a high degree of ACID compliance as well as offering a well-known set-based language to work with this data. In other cases, scale should be spread among multiple nodes (as in the case of Hadoop landscapes or NoSQL offerings) or even across a Cloud provider like Windows Azure Table Storage. In fact, in many cases - most of the ones I’m dealing with lately - the data should be split among multiple types of processing environments. This is a newer idea. Many data professionals simply pick a methodology (RDBMS with Star Schemas, NoSQL, etc.) 
and put all data there, regardless of its shape, processing needs and so on. A Data Scientist is familiar not only with the various processing methods, but how they work, so that they can choose the right one for a given need. This is a huge time commitment, hence the need for a dedicated title like this one. Presentation This is where the need for a Data Scientist is most often already being filled, sometimes with more or less success. The latest Business Intelligence systems are quite good at allowing you to create amazing graphics - but it’s the data behind the graphics that is the most important component of truly effective displays. This is where the mathematics requirement of the Data Scientist title is the most unforgiving. In fact, someone without a good foundation in statistics is not a good candidate for creating reports. Even a basic level of statistics can be dangerous. Anyone who works in analyzing data will tell you that there are multiple errors possible when data just seems right - and basic statistics bears out that you’re on the right track - that are only solvable when you understand why the statistical formula works the way it does. And there are lots of ways of presenting data. Sometimes all you need is a “yes” or “no” answer that can only come after heavy analysis work. In that case, a simple e-mail might be all the reporting you need. In others, complex relationships and multiple components require a deep understanding of the various graphical methods of presenting data. Knowing which kind of chart, color, graphic or shape conveys a particular datum best is essential knowledge for the Data Scientist. Why I’m excited I love this area of study. I like math, stats, and computing technologies, but it goes beyond that. I love what data can do - how it can help an organization.
I’ve been fortunate enough in my professional career these past two decades to work with lots of folks who perform this role at companies from aerospace to medical firms, from manufacturing to retail. Interestingly, the size of the company really isn’t germane here. I worked with one very small bio-tech (cryogenics) company that worked deeply with analysis of complex interrelated data. So watch this space. No, I’m not leaving Azure or distributed computing or Microsoft. In fact, I think I’m perfectly situated to investigate this role further. We have a huge set of tools, from RDBMS to Hadoop to allow me to explore. And I’m happy to share what I learn along the way.

Columnstore Case Study #2: Columnstore faster than SSAS Cube at DevCon Security

- by aspiringgeek

Preamble This is the second in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014. Many of these can be found in my big deck along with details such as internals, best practices, caveats, etc. The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative. See also Columnstore Case Study #1: MSIT SONAR Aggregations. Why Columnstore? As stated previously, if we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows (DW, reporting, aggregations, grouping, scans, etc.), SQL Server has never had a good mechanism - until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps. In SQL Server 2014, potential blockers have been largely removed & they're going to profoundly change the way we interact with our data. The purpose of this series is to share the performance benefits of columnstore & to document that columnstore is a compelling reason to upgrade to SQL Server 2014. The Customer DevCon Security provides home & business security services & has been in business for 135 years. I met DevCon personnel while speaking to the Utah County SQL User Group on 20 February 2012. (Thanks to TJ Belt (b|@tjaybelt) & Ben Miller (b|@DBADuck) for the invitation which serendipitously coincided with the height of ski season.) The App: DevCon Security Reporting: Optimized & Ad Hoc Queries DevCon users interrogate a SQL Server 2012 Analysis Services cube via SSRS. In addition, the SQL Server 2012 relational back end is the target of ad hoc queries; this DW back end is refreshed nightly during a brief maintenance window via conventional table partition switching.
SSRS, SSAS, & MDX Conventional relational structures were unable to provide adequate performance for user interaction for the SSRS reports. An SSAS solution was implemented requiring personnel to ramp up technically, including learning enough MDX to satisfy requirements. Ad Hoc Queries Even though the fact table is relatively small (only 22 million rows & 33GB), the table was a typical DW table in terms of its width: 137 columns, any of which could be the target of ad hoc interrogation. As is common in DW reporting scenarios such as this, it is often nearly impossible to optimize for such queries using conventional indexing. DevCon DBAs & developers attended PASS 2012 & were introduced to the marvels of columnstore in a session presented by Klaus Aschenbrenner (b|@Aschenbrenner). The Details Classic vs. columnstore before-&-after metrics are impressive.

Scenario        Conventional Structures             Columnstore     Δ
SSRS via SSAS   10 - 12 seconds                     1 second        >10x
Ad Hoc          5 - 7 minutes (300 - 420 seconds)   1 - 2 seconds   >100x

Here are two charts characterizing this data graphically. The first is a linear representation of Report Duration (in seconds) for Conventional Structures vs. Columnstore Indexes. As is so often the case when we chart such significant deltas, the linear scale doesn’t expose some of the dramatically improved values corresponding to the columnstore metrics. Just to make it fair here’s the same data represented logarithmically; yet even here the values corresponding to 1 - 2 seconds aren’t visible. The Wins Performance: Even prior to columnstore implementation, at 10 - 12 seconds canned report performance against the SSAS cube was tolerable. Yet the 1 second performance afterward is clearly better. As significant as that is, imagine the user experience re: ad hoc interrogation. The difference between several minutes vs.
one or two seconds is a game changer, literally changing the way users interact with their data: no mental context switching, no wondering when the results will appear, no preoccupation with the spinning mind-numbing hurry-up-&-wait indicators. As we’ve commonly found elsewhere, columnstore indexes here provided performance improvements of one, two, or more orders of magnitude. Simplified Infrastructure: Because in this case a nonclustered columnstore index on a conventional DW table was faster than an Analysis Services cube, the entire SSAS infrastructure was rendered superfluous & was retired. PASS Rocks: Once again, the value of attending PASS is proven out. The trip to Charlotte combined with eager & enquiring minds led directly to this success story. Find out more about the next PASS Summit here, hosted this year in Seattle on November 4 - 7, 2014. DevCon BI Team Lead Nathan Allan provided this unsolicited feedback: “What we found was pretty awesome. It has been a game changer for us in terms of the flexibility we can offer people that would like to get to the data in different ways.” Summary For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing. I have documented here, in the second of a series of reports on columnstore implementations, results from DevCon Security, a live customer production app for which report query performance improved by factors of 10x to more than 100x: canned reports dropped from 10 - 12 seconds to 1 second, & ad hoc queries dropped from 5 - 7 minutes to 1 - 2 seconds. As a result of columnstore performance, the customer retired their SSAS infrastructure. I invite you to consider leveraging columnstore in your own environment. Let me know if you have any questions.
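
The speedup factors in the before-&-after metrics can be checked with a little arithmetic. The sketch below is only an illustration in Python: the durations are the ones quoted in the case study, while the function name and the worst-case/best-case framing are assumptions added here. The last line shows one plausible reason the columnstore values vanish on a logarithmic chart, assuming the axis starts at 1 second.

```python
import math

# Durations in seconds, as quoted in the case study's metrics.
durations = {
    "SSRS via SSAS": {"conventional": (10, 12), "columnstore": (1, 1)},
    "Ad hoc": {"conventional": (300, 420), "columnstore": (1, 2)},
}

def speedup_range(before, after):
    """Return (worst case, best case) speedup for two (min, max) duration ranges."""
    worst = before[0] / after[1]   # fastest old run vs. slowest new run
    best = before[1] / after[0]    # slowest old run vs. fastest new run
    return worst, best

for scenario, d in durations.items():
    worst, best = speedup_range(d["conventional"], d["columnstore"])
    print(f"{scenario}: {worst:.0f}x to {best:.0f}x faster")

# On a log10 axis that starts at 1, a 1-second value sits at height 0.
print(math.log10(1), round(math.log10(2), 2))
```

Even the most conservative pairing for the ad hoc case (300 seconds before, 2 seconds after) gives 150x, which is consistent with the ">100x" figure quoted in the post.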

Profiling Startup Of VS2012 – SpeedTrace Profiler

- by Alois Kraus

SpeedTrace is a relatively unknown profiler made by a company called Ipcas. A single professional license does cost 449€+VAT. For the test I did use SpeedTrace 4.5 which is currently in beta. Although it is cheaper than dotTrace it has by far the most options to influence how profiling does work. First you need to create a tracing project which does configure tracing for one process type. You can start the application directly from the profiler or (much more interesting) it does attach to a specific process when it is started. For this you need to check the “Trace the specified …” radio button and enter the process name in the “Process Name of the Trace” edit box. You can even selectively enable tracing for processes with a specific command line. Then you need to activate the trace project by pressing the Activate Project button and you are ready to start VS as usual. If you want to profile the next 10 VS instances that you start you can set the Number of Processes counter to e.g. 10. This is immensely helpful if you are trying to profile only the next 5 started processes. As you can see there are many more tabs which do allow you to influence tracing in a much more sophisticated way. SpeedTrace is the only profiler which does not rely entirely on the profiling API of .NET. Instead it does modify the IL code (instrumentation on the fly) to write tracing information to disk which can later be analyzed. This approach is not only very fast but it does give you unprecedented analysis capabilities. Once the traces are collected they do show up in your workspace where you can open the trace viewer. I do skip the other windows because this view is by far the most useful one. You can sort the methods not only by Wall Clock time but also by CPU consumption and wait time which none of the other products support in their views at the same time. If you want to optimize for CPU consumption sort by CPU time.
If you want to find out where most time is spent you need Clock Total time and Clock Waiting. There you can directly see if the method did take long because it did wait on something or it did really execute stuff that did take so long. Once you have found a method you want to drill into, you can double click on it to get to the Caller/Callee view which is similar to the JetBrains Method Grid view. But this time you do see much more. In the middle is the clicked method. Above are the methods that call you and below are the methods that you do directly call. Normally you would then start digging deeper to find the end of the chain where the slow method worth optimizing is located. But there is a shortcut. You can press the magic button to calculate the aggregation of all called methods. This is displayed in the lower left window where you can see each method call and how long it did take. There you can also sort to see if this call stack does only contain methods (e.g. WCF connect calls which you cannot make faster) not worth optimizing. YourKit has a similar feature where it is called Callees List. In the Functions tab you have in the context menu also many other useful analysis options. One really outstanding feature is the View Call History Drilldown. When you select this one you get not a sum of all method invocations but a list with the duration of each method call. This is not surprising since SpeedTrace does use tracing to get its timings. There you can get many useful graphs of how this method did behave over time. Did it become slower at some point in time or was only the first call slow? The diagrams and the list will tell you that. That is all fine but what should I do when one method call was slow? I want to see where it was coming from. No problem: select the method in the list, hit F10 and you get the call stack. This is a life saver if you e.g. search for serialization problems. Today Serializers are used everywhere.
You want to find out from where the 5s XmlSerializer.Deserialize call did come from? Hit F10 and you get the call stack which did invoke the 5s Deserialize call. The CPU timeline tab is also useful to find out where long pauses or excessive CPU consumption did happen. Click in the graph to get the Thread Stacks window where you can get a quick overview what all threads were doing at this time. This does look like the Stack Traces feature in YourKit. Only this time you get the last called method first which helps to quickly see what all threads were executing at this moment. YourKit does generate a rather long list which can be hard to go through when you have many threads. The thread list in the middle does not give you call stacks or anything like that but you see which methods were found most often executing code by the profiler which is a good indication for methods consuming most CPU time. This does sound too good to be true? I have not told you the best part yet. The best thing about this profiler is the staff behind it. When I do see a crash or some other odd behavior I send a mail to Ipcas and I do get usually the next day a mail that the problem has been fixed and a download link to the new version. The guys at Ipcas are even so helpful to log in to your machine via a Citrix Client to help you to get started profiling your actual application you want to profile. After a 2h telco I was converted from a hater to a believer of this tool. The fast response time might also have something to do with the fact that they are actively working on 4.5 to get out of the door. But still the support is by far the best I have encountered so far. The only downside is that you should instrument your assemblies including the .NET Framework to get most accurate numbers. You can profile without doing it but then you will see very high JIT times in your process which can severely affect the correctness of the measured timings. 
If you do not care about exact numbers you can also enable in the main UI in the Data Trace tab logging of method arguments of primitive types. If you need to know what files at which times were opened by your application you can find it out without a debugger. Since SpeedTrace does read huge trace files in its reader you should perhaps use a 64 bit machine to be able to analyze bigger traces as well. The memory consumption of the trace reader is too high for my taste. But they did promise for the next version to come up with something much improved.
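
The distinction between wall clock time, CPU time and wait time that the review relies on can be illustrated outside any profiler. The following is a minimal Python sketch, not SpeedTrace's method: it uses the standard `time.perf_counter` and `time.process_time` clocks, and the helper names (`measure`, `waits`, `computes`) are made up for the example. Waiting time is approximated as wall time minus CPU time, which only holds for a single-threaded process.

```python
import time

def measure(func):
    """Run func once and return (wall_seconds, cpu_seconds, waiting_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    # Time not spent on the CPU is treated as waiting (sleep, I/O, locks).
    return wall, cpu, max(wall - cpu, 0.0)

def waits():
    time.sleep(0.2)  # blocked, uses almost no CPU

def computes():
    sum(i * i for i in range(500_000))  # busy, almost no waiting

for f in (waits, computes):
    wall, cpu, waiting = measure(f)
    print(f"{f.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s waiting={waiting:.3f}s")
```

Sorting by CPU time would rank `computes` first, while sorting by wall clock or waiting time would rank `waits` first, which is why it helps to see the columns side by side.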

How to Identify Which Hardware Component is Failing in Your Computer

- by Chris Hoffman

Concluding that your computer has a hardware problem is just the first step. If you’re dealing with a hardware issue and not a software issue, the next step is determining what hardware problem you’re actually dealing with. If you purchased a laptop or pre-built desktop PC and it’s still under warranty, you don’t need to care about this. Have the manufacturer fix the PC for you — figuring it out is their problem. If you’ve built your own PC or you want to fix a computer that’s out of warranty, this is something you’ll need to do on your own. Blue Screen 101: Search for the Error Message This may seem like obvious advice, but searching for information about a blue screen’s error message can help immensely. Most blue screens of death you’ll encounter on modern versions of Windows will likely be caused by hardware failures. The blue screen of death often displays information about the driver that crashed or the type of error it encountered. For example, let’s say you encounter a blue screen that identified “NV4_disp.dll” as the driver that caused the blue screen. A quick Google search will reveal that this is the driver for NVIDIA graphics cards, so you now have somewhere to start. It’s possible that your graphics card is failing if you encounter such an error message. Check Hard Drive SMART Status Hard drives have a built in S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) feature. The idea is that the hard drive monitors itself and will notice if it starts to fail, providing you with some advance notice before the drive fails completely. This isn’t perfect, so your hard drive may fail even if SMART says everything is okay. If you see any sort of “SMART error” message, your hard drive is failing. You can use SMART analysis tools to view the SMART health status information your hard drives are reporting. Test Your RAM RAM failure can result in a variety of problems. 
If the computer writes data to RAM and the RAM returns different data because it’s malfunctioning, you may see application crashes, blue screens, and file system corruption. To test your memory and see if it’s working properly, use Windows’ built-in Memory Diagnostic tool. The Memory Diagnostic tool will write data to every sector of your RAM and read it back afterwards, ensuring that all your RAM is working properly. Check Heat Levels How hot is it inside your computer? Overheating can result in blue screens, crashes, and abrupt shut downs. Your computer may be overheating because you’re in a very hot location, it’s ventilated poorly, a fan has stopped inside your computer, or it’s full of dust. Your computer monitors its own internal temperatures and you can access this information. It’s generally available in your computer’s BIOS, but you can also view it with system information utilities such as SpeedFan or Speccy. Check your computer’s recommended temperature level and ensure it’s within the appropriate range. If your computer is overheating, you may see problems only when you’re doing something demanding, such as playing a game that stresses your CPU and graphics card. Be sure to keep an eye on how hot your computer gets when it performs these demanding tasks, not only when it’s idle. Stress Test Your CPU You can use a utility like Prime95 to stress test your CPU. Such a utility will force your computer’s CPU to perform calculations without allowing it to rest, working it hard and generating heat. If your CPU is becoming too hot, you’ll start to see errors or system crashes. Overclockers use Prime95 to stress test their overclock settings; if Prime95 experiences errors, they throttle back on their overclocks to ensure the CPU runs cooler and more stable. It’s a good way to check if your CPU is stable under load. Stress Test Your Graphics Card Your graphics card can also be stress tested.
For example, if your graphics driver crashes while playing games, the games themselves crash, or you see odd graphical corruption, you can run a graphics benchmark utility like 3DMark. The benchmark will stress your graphics card and, if it’s overheating or failing under load, you’ll see graphical problems, crashes, or blue screens while running the benchmark. If the benchmark seems to work fine but you have issues playing a certain game, it may just be a problem with that game. Swap it Out Not every hardware problem is easy to diagnose. If you have a bad motherboard or power supply, their problems may only manifest through occasional odd issues with other components. It’s hard to tell if these components are causing problems unless you replace them completely. Ultimately, the best way to determine whether a component is faulty is to swap it out. For example, if you think your graphics card may be causing your computer to blue screen, pull the graphics card out of your computer and swap in a new graphics card. If everything is working well, it’s likely that your previous graphics card was bad. This isn’t easy for people who don’t have boxes of components sitting around, but it’s the ideal way to troubleshoot. Troubleshooting is all about trial and error, and swapping components out allows you to pin down which component is actually causing the problem through a process of elimination. This isn’t a complete guide to everything that could likely go wrong and how to identify it — someone could write a full textbook on identifying failing components and still not cover everything. But the tips above should give you some places to start dealing with the more common problems. Image Credit: Justin Marty on Flickr
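
The write-then-read-back idea behind a memory test can be shown with a toy example. This Python sketch is only an illustration of the principle: real diagnostics run outside the operating system against physical memory, whereas this code exercises an ordinary buffer. The `StuckByte` class is a hypothetical stand-in for faulty hardware, and the choice of bit patterns is an assumption.

```python
def pattern_test(buffer, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Write each pattern to every byte, read it back, return mismatching offsets."""
    bad = set()
    for pattern in patterns:
        for i in range(len(buffer)):
            buffer[i] = pattern
        for i in range(len(buffer)):
            if buffer[i] != pattern:
                bad.add(i)
    return sorted(bad)

class StuckByte(bytearray):
    """Hypothetical faulty memory: offset 3 always reads back as 0xFF."""
    def __getitem__(self, index):
        return 0xFF if index == 3 else super().__getitem__(index)

print(pattern_test(bytearray(16)))   # healthy buffer: []
print(pattern_test(StuckByte(16)))   # simulated fault: [3]
```

Several patterns are used because a fault that happens to return the value being written (here 0xFF) would otherwise go unnoticed.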

Why JSF Matters (to You)

- by reza_rahman

"Those who have knowledge, don’t predict. Those who predict, don’t have knowledge." – Lao Tzu You may have noticed Thoughtworks recently crowned the likes of AngularJS, etc. the imminent successors to server-side web frameworks. They apparently also deemed it necessary to single out JSF for righteous scorn. I have to say as I was reading the analysis I couldn't help but remember they also promptly jumped on the Ruby, Rails, Clojure, etc. bandwagon a good few years ago, seemingly similarly crowning these dynamic languages the imminent successors to Java. I remember wondering then, as I do now, whether the folks at Thoughtworks are really that much smarter than me or if they are simply more prone to the Hipster buzz of the day. I'll let you make the final call on that one. I also noticed mention of "J2EE" in the context of JSF and had to wonder how up-to-date or knowledgeable the person writing the analysis actually was given that the term was basically retired almost a decade ago. There's one thing that I am absolutely sure about though - as a long time pretty happy user of JSF, I had no choice but to speak up on what I believe JSF offers. If you feel the same way, I would encourage you to support the team behind JSF whose hard work you may have benefited from over the years. True to his outspoken character PrimeFaces lead Cagatay Civici certainly did not mince words making the case for the JSF ecosystem - his excellent write-up is well worth a read. He specifically pointed out the practical problems in going whole hog with bare metal JavaScript, CSS, HTML for many development teams. I'll admit I had to smile when I read his closing sentence as well as the rather cheerful comments to the post from actual current JSF/PrimeFaces users that are apparently supposed to be on a gloomy death march.
In a similar vein, OmniFaces developer Arjan Tijms did a great job pointing out the fact that despite the extremely competitive server-side Java Web UI space, JSF seems to manage to consistently come out in either the number one or number two spot over many years and many data sources - do give his well-written message in the JAX-RS user forum a careful read. I don't think it's really reasonable to expect this to be the case for so many years if JSF was not at least a capable if not outstanding technology. In fact, if you've ever wondered, Oracle itself is one of the largest JSF users on the planet. As Oracle's Shay Shmeltzer explains in a recent JSF Central interview, many of Oracle's strategic products such as ADF, ADF Mobile and Fusion Applications itself are built on JSF. There are well over 3,000 active developers working on these codebases. I don't think anyone can think of a more compelling reason to make sure that a technology is as effective as possible for practical development under real world conditions. Standing on the shoulders of the above giants, I feel like I can be pretty brief in making my own case for JSF: JSF is a powerful abstraction that brings the original Smalltalk MVC pattern to web development. This means cutting down boilerplate code to the bare minimum such that you really can think of just writing your view markup and then simply wiring up some properties and event handlers on a POJO. The best way to see what this really means is to compare JSF code for a pretty small case to other approaches. You should then multiply the additional work for the typical enterprise project to try to understand what the productivity trade-offs are. This alone is reason for me to personally never take any other approach seriously as my primary web UI solution unless it can match the sheer productivity of JSF.
Thanks to JSF's focus on components from the ground up, JSF has an extremely strong ecosystem that includes projects like PrimeFaces, RichFaces, OmniFaces, ICEFaces and of course ADF Faces/Mobile. These component libraries taken together constitute perhaps the largest widget set ever developed and optimized for a single web UI technology. To begin to grasp what this really means, just briefly browse the excellent PrimeFaces showcase and think about the fact that you can readily use the widgets on that showcase by just using some simple markup and knowing next to nothing about AJAX, JavaScript or CSS. JSF has the fair and legitimate advantage of being an open vendor neutral standard. This means that no single company, individual or insular clique controls JSF - openness, transparency, accountability, plurality, collaboration and inclusiveness are virtually guaranteed by the standards process itself. You have the option to choose between compatible implementations, escape any form of lock-in or even create your own compatible implementation! As you might gather from the quote at the top of the post, I am not a fan of crystal ball gazing and certainly don't want to engage in it myself. Who knows? However far-fetched it may seem, maybe AngularJS is the only future we all have after all. If that is the case, so be it. Unlike what you might have been told, Java EE is about choice at heart and it can certainly work extremely well as a back-end for AngularJS. Likewise, you are also most certainly not limited to just JSF for working with Java EE - you have a rich set of choices like Struts 2, Vaadin, Errai, VRaptor 4, Wicket or perhaps even the new action-oriented web framework being considered for Java EE 8 based on the work in Jersey MVC... Please note that any views expressed here are my own only and certainly do not reflect the position of Oracle as a company.

WinSat command line closes too fast

- by Rob Cowell

I'm trying to do some analysis under Windows 7 as to why I can't get a Windows Experience Index (WEI) rating due to disk issues. To this end, I'm trying to run winsat from the command line with winsat disk -seq -read -drive c and winsat disk -ran -write -n 2, but the command window closes too quickly for me to read the results. I've tried opening a separate cmd window to run it in, but it still insists on launching its own window to run in, which closes straight away. Any idea how I can see the output?
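
One way to keep the results is to capture the tool's output in a file instead of reading it from a window. The Python sketch below is a hedged illustration: `run_and_log` and the log file names are hypothetical, and the two winsat command lines are copied from the question. A likely cause of the extra window is that winsat needs administrator rights and relaunches itself elevated, so the sketch assumes it is started from an elevated command prompt.

```python
import subprocess
import sys

def run_and_log(command, log_path):
    """Run command, save its combined stdout/stderr to log_path, return the text."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    with open(log_path, "w", encoding="utf-8") as log:
        log.write(result.stdout)
    return result.stdout

if __name__ == "__main__" and sys.platform == "win32":
    # Commands from the question; assumes an elevated (administrator) prompt.
    run_and_log(["winsat", "disk", "-seq", "-read", "-drive", "c"], "winsat_seq_read.txt")
    run_and_log(["winsat", "disk", "-ran", "-write", "-n", "2"], "winsat_ran_write.txt")
```

The same effect can usually be had without Python by redirecting the command's output to a file from an elevated cmd window.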

Photoshop CS6 beta - functionality

- by Biker John

Is the Photoshop CS6 beta fully featured, or is it just a preview of SOME of the new functions? I want to make sure before I remove my CS5 version. Two contradictory statements on the official CS6 beta download page are making me unsure: Explore Photoshop CS6 beta for a sneak preview of some of the incredible performance enhancements, imaging magic, and creativity tools we are working on... AND Photoshop CS6 beta includes all the features in Photoshop CS6 and Photoshop CS6 Extended. Take this opportunity to try out the 3D image editing and quantitative image analysis capabilities of Photoshop Extended*, but note that, while these features will be included in the shipping version of Photoshop CS6 Extended, they will not be included in the shipping version of Photoshop CS6...

How can I get metrics such as incoming and outgoing traffic with Apache servers?

- by hhh

Suppose a network consisting of hubs A, B, C, D ... and X. I am looking for ways to visualize how users use the network, such as incoming and outgoing traffic and other metrics. In the Apache logs I can see some errors when something did not work, but I have no realistic picture of the system in general, i.e. how it actually works. I am looking for some sort of flow analysis, and I would like to get the raw data to create a graph. I would then analyze the graph with some metrics, although I do not even know the right metrics; perhaps some dispersion metric. My goal is to create some sort of objective way to judge quality.
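
As a starting point for the "raw data plus a dispersion metric" idea, request counts per minute can be pulled from an access log and summarised. This Python sketch is an assumption-laden illustration: it expects timestamps in Apache's Common/Combined Log Format, the function names and sample lines are invented, and the population standard deviation is just one possible dispersion measure.

```python
import re
import statistics
from collections import Counter
from datetime import datetime

# Timestamp as written in Common/Combined Log Format, e.g. [10/Oct/2013:13:55:36 +0000]
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [+-]\d{4}\]")

def requests_per_minute(lines):
    """Count requests per minute; lines without a timestamp are skipped."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            minute = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M")
            counts[minute] += 1
    return counts

def dispersion(counts):
    """Return (mean, population standard deviation) of the per-minute counts.

    Minutes with no requests at all do not appear in counts.
    """
    values = list(counts.values())
    return statistics.mean(values), statistics.pstdev(values)

# Invented sample lines for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2013:13:55:01 +0000] "GET / HTTP/1.1" 200 2326',
    '203.0.113.6 - - [10/Oct/2013:13:55:20 +0000] "GET /a HTTP/1.1" 200 512',
    '203.0.113.5 - - [10/Oct/2013:13:55:59 +0000] "GET /b HTTP/1.1" 404 209',
    '203.0.113.7 - - [10/Oct/2013:13:56:03 +0000] "GET / HTTP/1.1" 200 2326',
]
print(dispersion(requests_per_minute(sample)))
```

For incoming and outgoing byte counts, Apache's mod_logio adds the `%I` and `%O` log format fields, which could be summed per minute in the same way.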

Read the article
Are there any tools for monitoring individual Apache virtual hosts in real-time?

- by Dave Forgac

I'm looking for a way to monitor and record Apache traffic, separated by virtual host. I am currently using Munin to capture this and other data for the entire server; however, I can't seem to find a way to do this by vhost. This link describes using a module called mod_watch, which is apparently no longer in development: http://www.freshnet.org/wordpress/2007/03/08/monitoring-apaches-virtualhost-with-munin/ The file that is listed as being compatible with Apache 2.x is reported to have problems with missing vhosts and reporting data correctly. Does anyone know of a reliable way to determine real-time traffic per vhost? If I can find this it should be easy enough to write a new Munin plugin. Edit: What I'd really like to see is something similar to the Apache server-status scoreboard page with the number of connections / requests separated by virtual host. This would give me the ability to check which vhost may be experiencing a spike in traffic in real time and would also provide the data needed for a Munin module (or some alternative performance monitoring / analysis system).
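One possible starting point, sketched here rather than taken from the question, is to count requests per vhost from the access log itself. The sketch assumes the log format puts the virtual host first on each line (for example a LogFormat beginning with %v:%p, as in the vhost_combined format some distributions ship); the function name and the sample lines are hypothetical.

```python
from collections import Counter


def count_requests_per_vhost(lines):
    """Count requests per virtual host.

    Assumes the first whitespace-separated field of each line is the
    vhost, optionally followed by ':port' (a '%v:%p ...' LogFormat).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        first_field = line.split(None, 1)[0]
        vhost = first_field.rsplit(":", 1)[0]  # drop the ':port' suffix if present
        counts[vhost] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    'example.com:80 203.0.113.5 - - [08/Mar/2007:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    'example.com:80 203.0.113.9 - - [08/Mar/2007:10:00:01 +0000] "GET /a HTTP/1.1" 200 128',
    'blog.example.org:80 198.51.100.7 - - [08/Mar/2007:10:00:02 +0000] "GET / HTTP/1.1" 404 64',
]
vhost_counts = count_requests_per_vhost(sample)
for name, total in vhost_counts.most_common():
    print(name, total)
```

A Munin plugin could call something like this on the lines written since its last run; that gives near-real-time request counts, not the live connection view of the server-status scoreboard.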

Read the article
Cannot read/access Apache2 access logs

- by webworm

I have been asked to take a look at some access logs for an Apache2 web server running on Ubuntu. I have been told by the administrator of the machine that my login has "admin" access, yet I cannot seem to copy the access logs from Apache2 to my local machine via FTP for analysis. I figure one of two things is happening: 1) I don't really have full admin access. 2) Some other process (perhaps Apache2) has control of the log files and won't let me copy them. How can I tell if I truly have admin access? What type of access do I need to request? Root access? Something else? Should I be able to copy these log files with admin access?

Read the article
SQL log shipping for reporting

- by Patrick J Collins

I would like to create a read-only copy of my SQL Server 2008 database on a secondary server for reporting and analysis. I've been testing log shipping, configured to run every 5 minutes or so. Alas, there appears to be a stumbling block, for exclusive access is required on the target database during the restore, which in turn requires killing all active connections. This is far from ideal, especially if a user is in the middle of running a report. Any better suggestions? Edit : I'm doing this on the Express edition.

Read the article
Flexible traffic & bandwidth monitor

- by BrNathan

I have looked around, but have not found anything to meet our needs. I need something that can log all connections & bandwidth consumption. We need it for analysis: by protocol, source IP (& MAC if possible), destination, etc. Ideally we are looking for something that can produce custom graphs & also uses mysql. All connections go through one server on a bridged connection (2 network cards) so it is easy to pick up traffic. We are not concerned so much with internal LAN traffic as with what passes in & out to the firewall. Thanks for your suggestions. Update: I use Ubuntu 10.04

Read the article
solr administration

- by devrick0

Does anyone have any notes for a sysadmin supporting solr? I'm looking for anything that might be useful for monitoring & metrics as well as troubleshooting. Some useful links I have found are: /solr/admin/stats.jsp and /solr/admin/analysis.jsp In the logs I have noticed, other than the query, "hits", "status" and "QTime" values. The documentation on what these mean is sparse, at least based on the 100+ websites I have checked. QTime appears to be the query response time in milliseconds. Hits is some form of results but I'm not sure exactly what makes that up, and I'm not sure about status. Typically I see status come back as "0" but I have seen other numbers such as "5", so my guess that it is either an HTTP status code or a 0 or 1 (good or bad) flag isn't accurate. All of the documentation I have come across is intended for developers. Any sysadmin-centric documentation would be a big help.
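For monitoring, the three values mentioned above can be pulled out of the request log with a small script. This is a minimal sketch, assuming each request line carries them as key=value pairs (hits=..., status=..., QTime=...); the sample line and the function name are made up for illustration and may not match a given Solr version's exact log layout.

```python
import re

# Matches the three key=value pairs the question mentions.
FIELD_RE = re.compile(r"\b(hits|status|QTime)=(\d+)")


def parse_solr_request_line(line):
    """Return whichever of hits/status/QTime appear on the line, as integers."""
    return {name: int(value) for name, value in FIELD_RE.findall(line)}


# Hypothetical log line, for illustration only.
sample_line = "INFO: [] webapp=/solr path=/select params={q=title:foo} hits=42 status=0 QTime=7"
fields = parse_solr_request_line(sample_line)
print(fields)
```

Feeding QTime into a graphing tool over time is one way to spot slow queries. Note that a query parameter that happens to be named hits, status or QTime would also match, so a stricter pattern may be needed for real logs.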

Read the article
Identical traffic

- by Walter White

Hi all, I am running an application server and logging all requests for analysis purposes later. One interesting trend I noticed last night was, I had a visitor from Texas on FIOS share identical traffic with bluecoat in California. What would cause the traffic to be identical? For every request the visitor made, bluecoat made one subsequently within milliseconds of his request. If it is caching, why would there be identical requests? Wouldn't it go through the cache / proxy on their end, and I would only see the proxied request? I'm just curious, this is an interesting pattern that shows similarities of a DDoS attack, but with far fewer resources. Is it possible that the visitor had malware on their computer? Any other ideas? Walter

Read the article
Stop SQL Server services conveniently

- by MedicineMan

I have a general-use laptop. I use it for games, development, and web surfing. I've just installed SQL Server 2008 with Analysis, Reporting, and Error reporting, as well as all of the other options on the installer. I also have a default instance of SQL Server as well as a named instance. When I'm not doing development, I'd like to shut down these services conveniently. I'm thinking that a batch file would be good. What are the commands to shut these services down and release the associated memory and resources? It appears that net stop MSSQLSERVER stops the MSSQLSERVER instance. What about the other services?
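A rough sketch of how the net stop calls could be scripted follows. The service names below are assumptions based on the usual SQL Server 2008 naming pattern (the default instance uses fixed names; a named instance appends $InstanceName) and should be checked against the Services console before use. DEV is a placeholder instance name, and the script only prints the commands unless dry_run is set to False.

```python
import subprocess


def sql_service_names(named_instances=()):
    """Build a list of likely SQL Server service names (assumed, verify locally)."""
    # Default-instance services; the agent comes before the engine it depends on.
    names = ["SQLSERVERAGENT", "MSSQLServerOLAPService", "ReportServer", "MSSQLSERVER"]
    for inst in named_instances:
        names += [f"SQLAgent${inst}", f"MSOLAP${inst}", f"ReportServer${inst}", f"MSSQL${inst}"]
    names.append("SQLBrowser")
    return names


def stop_services(names, dry_run=True):
    """Run (or, by default, just print) 'net stop <service>' for each name."""
    commands = [["net", "stop", name] for name in names]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Needs an elevated prompt; services that are not installed simply fail.
            subprocess.run(cmd, check=False)
    return commands


if __name__ == "__main__":
    stop_services(sql_service_names(["DEV"]))
```

The same list of net stop lines could equally be pasted into a batch file, which is what the question has in mind.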

Read the article
Intel Core i7-4960HQ vs. 4850HQ (Haswell) [on hold]

- by Timothy R. Butler

I'm looking at the new MacBook Pros and trying to decide between the Core i7-4960HQ (2.6 GHz) and i7-4850HQ (2.3 GHz). I've found some synthetic benchmarks comparing them, but I haven't found a lot of data, so I'd appreciate any pointers to good comparisons for the Haswell family (especially these two processors). My cursory analysis seems to suggest there isn't a huge gain from the extra 300 MHz. I'd like to determine not only whether this is generally true, but also whether the gains that are made in performance come at too high a cost. Is the 2.6 going to be pushing the limits of what can fit in a thin laptop without overheating? I've looked at some of Intel's documentation, but have not been able to determine what the normal and maximum operating temperature differences are for the models. In the past, there have been times that Intel's fastest models in a given range ran especially hot and/or consumed significantly more power compared to slightly slower models. Do those concerns factor into the current generation?

Read the article
Awstats messaging nonexistent user causing exim4 to go nuts

- by Chris

I've taken over managing a server set up by someone else who is now uncontactable. I've managed to work out most of the faults and changes needed, but this one is stumping me. Awstats is running on the machine and sending messages via exim4 to a user every time it runs an update. The user account has been deleted, so the exim4 main log files are filling up with message delivery errors, which firstly hinders meaningful log analysis for anything else and secondly uses up quite a lot of space (it grew to 22GB unattended, panic!). I've been through all the conf files in /etc/awstats and can't seem to find any mention of this user account. Google just turns up results about how to use awstats to parse exim4 log files. So the question is: where is this setting (on debian) likely to be? Cheers in advance

Read the article
Finding trends in multi-category data in Excel

- by Miral

I have an Excel spreadsheet that contains hundreds of rows of data that each represent a single sample in a larger population. Each row is divided into three columns that contain frequency counts of a specific type of thing. Together the three columns summed on a single row represent 100%, though each row will sum to a different value. What I'm most interested in are the proportions of each of these types (ie. percentages of each column relative to the sum of the three columns). I can easily calculate this on a per-row basis, but what I'm really interested in is trying to find an overall trend from the entire population. I don't really spend much time doing data analysis so the only thing I can think of trying is to create those percentage columns and then average them, but I'm sure there must be a better way to visualise this.
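To make the "average the percentage columns" idea concrete, here is a small sketch comparing it with the other obvious summary, the pooled proportion (sum each column over all rows, then divide by the grand total). The two generally differ when rows have different totals, because averaging per-row percentages weights every sample equally while pooling weights large samples more. The data and function names are invented, and every row is assumed to have a non-zero sum.

```python
def mean_of_row_proportions(rows):
    """Average the per-row proportions: every row counts equally."""
    n_cols = len(rows[0])
    totals = [0.0] * n_cols
    for row in rows:
        row_sum = sum(row)  # assumed non-zero
        for i, value in enumerate(row):
            totals[i] += value / row_sum
    return [t / len(rows) for t in totals]


def pooled_proportions(rows):
    """Sum each column first, then divide by the grand total: big rows count more."""
    col_sums = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_sums)
    return [c / grand_total for c in col_sums]


# Hypothetical counts: one small sample and one large sample.
rows = [(1, 1, 2), (10, 30, 60)]
print(mean_of_row_proportions(rows))
print(pooled_proportions(rows))
```

In Excel terms, the first corresponds to AVERAGE over the percentage columns and the second to dividing each column's SUM by the SUM of all three columns.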

Read the article
Is there any SMS/MMS server for LAN environment

- by Chau Chee Yang

I am looking for a solution to send SMS/MMS messages to mobile devices from a desktop or browser in a LAN environment. As such, it would most probably use TCP/IP to transmit requests and responses. The server may be attached to a GSM device with a SIM card. A server application would then accept requests from any LAN client and convey the SMS/MMS to one or more recipients. The server may log all requests for traffic analysis at a later stage. Is there any solution that is able to do what I describe here? Please advise.

Read the article
What is the S.M.A.R.T. page?

- by Mads Skjern

I've just listened to Steve Gibson talk about his SpinRite software, on the Security Now podcast episode 336 (transcript). At 33:20 he says: "I can show and do show on the SMART page that sectors are being relocated and that errors are being corrected. That SMART analysis page sometimes scares people because it shows, wait a minute, this thing says we're correcting so many errors per megabyte." What is this SMART page? 1) Some information saved on the HD by SMART, that I can access with a SMART tool like smartmontools? 2) A page (tab) in his SpinRite software? In any case, can I see, in any way, what sectors are marked as bad, without using SpinRite? Preferably using smartmontools!

Read the article
Packet logging on PIX firewall

- by georged.id.auindex.htm

We have a Cisco PIX 515 firewall and I would like to set up simple logging that would give us a traffic breakdown for billing by: source, destination, protocol, port, size, time. The PIX is plugged into a Catalyst 2970 and I was told that the best thing since sliced bread for logging is to get Netflow and get the Catalyst to log. My concern, however, (besides the Netflow cost) is that I really don't want to "listen" to the internal noise and all I'm interested in are the external traffic stats above for billing and analysis purposes. What would be the simplest and the easiest solution? Cheers George

Read the article
Verify server performance

- by George Kesler

I'm looking for a quick and SIMPLE way to verify that new servers are performing as expected. The most important metric is disk performance, second is network performance. I’m trying to prevent problems caused by misconfiguration of RAID arrays, NIC teaming etc. The solution should work with both physical and virtual servers. I don’t need sophisticated analysis with different workloads, just one set of benchmarks which I would run against a reference server and later compare to new ones. One problem is that most benchmarks are not giving accurate results when running on a VM.

Read the article
Analyzing Linux NFS server performance

- by Kamil Kisiel

I'd like to do some analysis of our NFS server to help track down potential bottlenecks in our applications. The server is running SUSE Enterprise Linux 10. The kind of things I'm looking to know are: which files are being accessed by which clients; read/write throughput on a per-client basis; overhead imposed by other RPC calls; time spent waiting on other NFS requests, or disk I/O, to service a client. I already know about the statistics available in /proc/net/rpc/nfsd and in fact I wrote a blog post describing them in depth. What I'm looking for is a way to dig deeper and help understand what factors are contributing to the performance seen by a particular client. I want to analyze the role the NFS server plays in the performance of an application on our cluster so that I can think of ways to best optimize it.
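As a baseline for the server-wide counters mentioned above, the following sketch parses text in the layout of /proc/net/rpc/nfsd into a dictionary keyed by each line's label. It covers only those aggregate statistics, not the per-client breakdown the question is after. The sample text and its numbers are invented, and the reading of the io line as bytes read followed by bytes written is stated here as an assumption to check against the kernel documentation.

```python
def parse_nfsd_stats(text):
    """Map each line's leading label (rc, io, th, ...) to its list of numbers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        # Some lines (e.g. 'th' on older kernels) contain decimal values.
        stats[parts[0]] = [float(p) if "." in p else int(p) for p in parts[1:]]
    return stats


# Invented sample in the general shape of /proc/net/rpc/nfsd.
sample_text = """rc 0 12 3456
io 1048576 524288
th 8 0 0.000 0.000
"""
nfsd = parse_nfsd_stats(sample_text)
bytes_read, bytes_written = nfsd["io"]  # assumed order: read, then written
print(bytes_read, bytes_written)
```

On a live server the text would come from reading /proc/net/rpc/nfsd; sampling it at intervals and differencing the counters gives rates.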

Read the article
Cannot connect to a SQL Server 2008 named instance hosted in an Azure virtual machine

- by emardini

When I try to connect to a named instance of SQL Server hosted in an Azure VM I get this message: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified) (Microsoft SQL Server, Error: -1) The problem is that the SQL Browser is not working properly: when I start the SQL Browser service it stops after a few seconds and the event log says "There are no instances of SQL Server or SQL Server Analysis Services." But I do have a named instance, and I can connect to it locally. I've re-installed SQL Browser and the instance but it still does not work. The host is an Azure virtual machine running Windows Server 2008 Datacenter. Please help. Thank you

Read the article
database on SSD: data only or the DBM program too?

- by simone

I plan on moving the data I use for statistical analysis (100-ish Gb) onto an SSD. The data is either sqlite single-file db's, or postgresql-managed data. The SSD is 240 Gb, 550 MB/s read and 520 MB/s write. Should I reserve that space for the data only, or would it be a good idea to install the operating system (Mac OS X) and the application directory (Adobe Suite, Microsoft Office and the like) on the SSD too? And would it make a substantial speed difference whether I also install the postgresql binaries on the SSD? I have plenty of other space (another 300Gb hard-drive, and a 1Tb one). Don't know the features of the non-SSD drives, though they're our standard equipment on all Macs, and they're definitely OK. Thanks.

Read the article
