Search Results

Search found 64472 results on 2579 pages for 'data context'.

Page 55/2579 | < Previous Page | 51 52 53 54 55 56 57 58 59 60 61 62  | Next Page >

  • Algorithm detect repeating/similiar strings in a corpus of data -- say email subjects, in Python

    - by RizwanK
    I'm downloading a long list of my email subject lines , with the intent of finding email lists that I was a member of years ago, and would want to purge them from my Gmail account (which is getting pretty slow.) I'm specifically thinking of newsletters that often come from the same address, and repeat the product/service/group's name in the subject. I'm aware that I could search/sort by the common occurrence of items from a particular email address (and I intend to), but I'd like to correlate that data with repeating subject lines.... Now, many subject lines would fail a string match, but "Google Friends : Our latest news" "Google Friends : What we're doing today" are more similar to each other than a random subject line, as is: "Virgin Airlines has a great sale today" "Take a flight with Virgin Airlines" So -- how can I start to automagically extract trends/examples of strings that may be more similar. Approaches I've considered and discarded ('because there must be some better way'): Extracting all the possible substrings and ordering them by how often they show up, and manually selecting relevant ones Stripping off the first word or two and then count the occurrence of each sub string Comparing Levenshtein distance between entries Some sort of string similarity index ... Most of these were rejected for massive inefficiency or likelyhood of a vast amount of manual intervention required. I guess I need some sort of fuzzy string matching..? In the end, I can think of kludgy ways of doing this, but I'm looking for something more generic so I've added to my set of tools rather than special casing for this data set. After this, I'd be matching the occurring of particular subject strings with 'From' addresses - I'm not sure if there's a good way of building a data structure that represents how likely/not two messages are part of the 'same email list' or by filtering all my email subjects/from addresses into pools of likely 'related' emails and not -- but that's a problem to solve after this one. Any guidance would be appreciated.

    Read the article

  • What changes in Core Data after a save?

    - by Splash6
    I have a Core Data based mac application that is working perfectly well until I save a file. When I save a file it seems like something changes in core data because my original fetch request no longer fetches anything. This is the fetch request that works before saving but returns an empty array after saving. NSEntityDescription *outputCellEntityDescription = [NSEntityDescription entityForName:@"OutputCell" inManagedObjectContext:[[self document] managedObjectContext]]; NSFetchRequest *outputCellRequest = [[[NSFetchRequest alloc] init] autorelease]; [outputCellRequest setEntity:outputCellEntityDescription]; NSPredicate *outputCellPredicate = [NSPredicate predicateWithFormat:@"(cellTitle = %@)", outputCellTitle]; [outputCellRequest setPredicate:outputCellPredicate]; NSError *outputCellError = nil; NSArray *outputCellArray = [[[self document] managedObjectContext] executeFetchRequest:outputCellRequest error:&outputCellError]; I have checked with [[[self document] managedObjectContext] registeredObjects] to see that the object still exists after the save and nothing seems to have changed and the object still exists. It is probably something fairly basic but does anyone know what I might be doing wrong? If not can anyone give me any pointers to what might be different in the Core Data model after a save so I might have some clues why the fetch request stops working after saving?

    Read the article

  • C++: best way to implement globally scoped data

    - by bobobobo
    I'd like to make program-wide data in a C++ program, without running into pesky LNK2005 errors when all the source files #includes this "global variable repository" file. I have 2 ways to do it in C++, and I'm asking which way is better. The easiest way to do it in C# is just public static members. C#: public static class DataContainer { public static Object data1 ; public static Object data2 ; } In C++ you can do the same thing C++ global data way#1: class DataContainer { public: static Object data1 ; static Object data2 ; } ; Object DataContainer::data1 ; Object DataContainer::data2 ; However there's also extern C++ global data way #2: class DataContainer { public: Object data1 ; Object data2 ; } ; extern DataContainer * dataContainer ; // instantiate in .cpp file In C++ which is better, or possibly another way which I haven't thought about? The solution has to not cause LNK2005 "object already defined" errors.

    Read the article

  • strange memory error when deleting object from Core Data

    - by llloydxmas
    I have an application that downloads an xml file, parses the file, and creates core data objects while doing so. In the parse code I have a function called 'emptydatacontext' that removes all items from Core Data before creating replacements from the xml file. This method looks like this: -(void) emptyDataContext { NSMutableArray* mutableFetchResults = [CoreDataHelper getObjectsFromContext:@"Condition" :@"road" :NO :managedObjectContext]; NSFetchRequest * allCon = [[NSFetchRequest alloc] init]; [allCon setEntity:[NSEntityDescription entityForName:@"Condition" inManagedObjectContext:managedObjectContext]]; NSError * error = nil; NSArray * conditions = [managedObjectContext executeFetchRequest:allCon error:&error]; [allCon release]; for (NSManagedObject * condition in conditions) { [managedObjectContext deleteObject:condition]; } } The first time this runs it deletes all objects and functions as it should - creating new objects from the xml file. I created a 'update' button that starts the exact same process of retrieving the file the preceeding as it did the first time. All is well until its time to delete the core data objects again. This 'deleteObject' call creates a "EXC_BAD_ACCESS" error each time. This only happens on the second time through. See this image for the debugger window as it appears when walking through the deletion FOR loop on the second iteration. Conditions is the fetched array of 7 objects with the objects below. Condition should be an individual condition. link text As you can see 'condition' does not match any of the objects in the 'conditions' array. I'm sure this is why I'm getting the memory access errors. Just not sure why this fetch (or the FOR) is returning a wrong reference. All the code that successfully performes this function on the first iteration is used in the second but with very different results. Thanks in advance for the help!

    Read the article

  • MVC and jQuery data retrieval.

    - by user337542
    Hello, I am using mvc and jQuery and I am trying to display someone's profile with some additional institutions that the person belongs to. I am new to this but I've done something like this in ProfileControler: public ActionResult Institutions(int id) { var inst = fr.getInstitutions(id); return Json(inst); } getInstitutions(id) returns Institution objects(with Name, City, Post Code etc.) Then in a certain View I am trying to get the data with jQuery and display them as follows: $(document).ready(function () { $.post("/Profile/Institutions", { id: <%= Model.Profile.userProfileID %> }, function (data) { $.each(data, function () { var new_div = $("<div>"); var new_label = $("<label>"); new_label.html(this.City); var new_input_b = $("<input>"); new_input_b.attr("type", "button"); new_div.append(new_label); new_div.append(new_input_b); $("#institutions").append(new_div); }); }); }); $("#institutions") is a div where i want to display all of the results. .post works correct for sure because certain institutions are retrieved from database, and passed to the view as Json result. But then I am affraid it wont itterate with .each. Any help, coments or pointing in some direction would be much appriciated

    Read the article

  • JTable data only shown after scrolling

    - by Christian 'fuzi' Orgler
    I wrote a method, that creates my DefaultTableModel and there I'm going to add my records. When I set the model to my JTable, the data rows are blank. After scrolling the data gets displayed correct. How can I avoid this and display the data from the first moment? EDIT: I imported the javax.swing.table.DefaultTableModel -- is this correct? private DefaultTableModel _dtm; private void loadTable(Vector<Member> members) { loadTableModel(); try { lbl_state.setText("Please wait"); for (Member actMember : members) { String gender = ""; if (actMember.getGender() == MemberView.MEMBER_MALE) { gender = "männlich"; } else { gender = "weiblich"; } _dtm.addRow(new Object[]{ actMember.getNname(), actMember.getVname(), actMember.getCity(), actMember.getStreet(), actMember.getPlz(), actMember.getMail(), actMember.getPhonenumber(), actMember.getBirthdayString(), actMember.getStartDateString(), gender, actMember.getBankname(), actMember.getAccountnumber(), actMember.getBanknumber(), actMember.getGroup().toString(), (actMember.hasAccess() ? "JA" : "NEIN"), actMember.getWriteDateString(), (actMember.hasDrinkAbo() ? "JA" : "NEIN") }); } } catch (Exception ex) { System.err.println(ex.getMessage()); } tbl_results.setModel(_dtm); } private void loadTableModel() { _dtm = new DefaultTableModel(new Object[]{"Nachname", "Vorname", "Ort", "Straße", "PLZ", "E-Mail", "Telefon", "Geburtsdatum", "Beitrittsdatum", "Geschlecht", "Bankname", "Kontonummer", "Bankleitzahl", "Gruppe", "hat Zugriff", "Einschreibdatum", "Getränkeabo"}, 0); tbl_results.setAutoResizeMode(JTable.AUTO_RESIZE_ALL_COLUMNS); }

    Read the article

  • NSFetchedResultsController on secondary UITableView - how to query data?

    - by Jason
    I am creating a core-data based Navigation iPhone app with multiple screens. Let's say it is a flash-card application. The data model is very simple, with only two entities: Language, and CardSet. There is a one-to-many relationship between the Language entity and the CardSet entities, so each Language may contain multiple CardSets. In other words, Language has a one-to-many relationship Language.cardSets which points to the list of CardSets, and CardSet has a relationship CardSet.language which points to the Language. There are two screens: (1) An initial TableView screen, which displays the list of languages; and (2) a secondary TableView screen, which displays the list of CardSets in the Language. In the initial screen, which lists the languages, I am using NSFetchedResultsController to keep the list of languages up-to-date. The screen passes the Language selected to the secondary screen. On the secondary screen, I am trying to figure out whether I should again use an NSFetchedResultsController to maintain the list of CardSets, or if I should work through Language.cardSets to simply pull the list out of the object model. The latter makes the most sense programatically because I already have the Language - but then it would not automatically be updated on changes. I have looked at the NSFetchedResultsController documentation, and it seems like I can easily create predicates based on attributes - but not relationships. I.e., I can create the following NSFetchedResultsController: NSPredicate *predicate = [NSPredicate predicateWithFormat:@"name LIKE[c] 'Chuck Norris'"]; How can I access my data through the direct relationship - Language.cardSets - and also have the table auto-update using NSFetchedResultsController? Is this possible?

    Read the article

  • Uniquing with Existing Core Data Entities

    - by warrenm
    I'm using Core Data to store a lot (1000s) of items. A pair of properties on each item are used to determine uniqueness, so when a new item comes in, I compare it against the existing items before inserting it. Since the incoming data is in the form of an RSS feed, there are often many duplicates, and the cost of the uniquing step is O(N^2), which has become significant. Right now, I create a set of existing items before iterating over the list of (possible) new items. My theory is that on the first iteration, all the items will be faulted in, and assuming we aren't pressed for memory, most of those items will remain resident over the course of the iteration. I see my options thusly: Use string comparison for uniquing, iterating over all "new" items and comparing to all existing items (Current approach) Use a predicate to filter the set of existing items against the properties of the "new" items. Use a predicate with Core Data to determine uniqueness of each "new" item (without retrieving the set of existing items). Is option 3 likely to be faster than my current approach? Do you know of a better way?

    Read the article

  • Import & modify date data in MATLAB

    - by niko
    I have a .csv file with records written in the following form: 2010-04-20 15:15:00,"8.9915176259e+00","8.8562623697e+00" 2010-04-20 15:30:00,"8.5718021723e+00","8.6633827160e+00" 2010-04-20 15:45:00,"8.4484844117e+00","8.4336586330e+00" 2010-04-20 16:00:00,"1.1106980342e+01","8.4333062208e+00" 2010-04-20 16:15:00,"9.0643470589e+00","8.6885660103e+00" 2010-04-20 16:30:00,"8.2133517943e+00","8.2677822671e+00" 2010-04-20 16:45:00,"8.2499419380e+00","8.1523501983e+00" 2010-04-20 17:00:00,"8.2948492278e+00","8.2884797924e+00" From these data I would like to make clusters - I would like to add a column with number indicating the hour - so in case of the first row a value 15 has to be added in a new row. The first problem is that calling a function [numData, textData, rawData] = xlsread('testData.csv') creates an empty matrix numData and one-column textData and rawData structures. Is it possible to create any template which recognizes a yyyy, MM, dd, hh, mm, ss values from the data above? What I would basically like to do with these data is to categorize the values by hours so from the example row of input: 2010-04-20 15:15:00,"8.9915176259e+00","8.8562623697e+00" update 1: in Matlab the line above is recognized as a string: '2010-04-26 13:00:00,"1.0428104753e+00","2.3456394130e+00"' I would want this to be the output: 15, 8.9915176259e+00, 8.8562623697e+00 update 1: a string has to be parsed Does anyone know how to parse a string and retrieve a timestamp, value1 (1.0428104753e+00) and value2 (2.3456394130e+00) from it as separate values?

    Read the article

  • Can't find momd file: Core Data problems

    - by thekevinscott
    Aw geez! I screwed something up! I'm a Core Data noob, working on my first iOS app. After much Stack Overflowing I'm using this code: NSString *path = [[NSBundle mainBundle] pathForResource:@"CoreData" ofType:@"momd"]; if (!path) { path = [[NSBundle mainBundle] pathForResource:@"CoreData" ofType:@"mom"]; } NSAssert(path != nil, @"Unable to find Resource in main bundle"); CoreData is the name of my app. I've tried to put in initial data into the app by finding the path to the sqlite file in my iPhone simulator, and then going and inserting into that sqlite file. But at some point, I moved the sqlite (thinking it would create a fresh copy), deleted the app from the simulator, and the sqlite file is gone. I'm not sure if I'm leaving out some part of the process (this was a few hours ago) but the end result is that everything is screwed up. How do I resubstantiate this sqlite / momd file? "Clean" and "Clean all targets" are grayed out. I'm happy to post the relevant code from my app that would help shed some light on this problem but there's tons of code relating to Core Data which I don't understand, so I'm not sure what part to post! Any help is greatly appreciated.

    Read the article

  • Multidimensional data structure?

    - by Austin Truong
    I need a multidimensional data structure with a row and a column. Must be able to insert elements any location in the data structure. Example: {A , B} I want to insert C in between A and B. {A, C, B}. Dynamic: I do not know the size of the data structure. Another example: I know the [row][col] of where I want to insert the element. EX. insert("A", 1, 5), where A is the element to be inserted, 1 is the row, 5 is the column. EDIT I want to be able to insert like this. static void Main(string[] args) { Program p = new Program(); List<string> list = new List<string>(); list.Insert(1, "HELLO"); list.Insert(5, "RAWR"); for (int i = 0; i < list.Count; i++) { Console.WriteLine(list[i]); } Console.ReadKey(); } And of course this crashes with an out of bounds error. So in a sense I will have a user who will choose which ROW and COL to insert the element to.

    Read the article

  • WCF data services - Limiting related objects returned based on critera

    - by Mike Morley
    I have an object graph consisting of a base employee object, and a set of related message objects. I am able to return the employee objects based on search criteria on the employee properties (eg team) etc. However, if I expand on the messages, I get the full collection of messages back. I would like to be able to either take the top n messages (i.e. restrict to 10 most recent) or ideally use a date range on the message objects to limit how many are brought back. So far I have not been able to figure out a way of doing this: I get an error if I attempt to filter on properties on the message (&$filter=employee/message/StartDate gives an error "No property 'StartDate' exists in type 'System.Data.Objects.DataClasses.EntityCollection`1). Attempting to use Top on the message related object doesn't work either. I have also tried using a WebGet extension that takes a string list of employee IDs. That works until the list gets too long, and then fails due to the URL getting too long (it might be possible to setup a paging mechanism on this approach)... Unfortunately the UI control I am using requires the data to be in a fairly specific hierarchical shape, so I can't easily come at this from starting on the message side and working backwards. Outside of making multiple calls does anyone know of a method to accomplish this with wcf data services? Thanks! M.

    Read the article

  • Compact data structure for storing a large set of integral values

    - by Odrade
    I'm working on an application that needs to pass around large sets of Int32 values. The sets are expected to contain ~1,000,000-50,000,000 items, where each item is a database key in the range 0-50,000,000. I expect distribution of ids in any given set to be effectively random over this range. The operations I need on the set are dirt simple: Add a new value Iterate over all of the values. There is a serious concern about the memory usage of these sets, so I'm looking for a data structure that can store the ids more efficiently than a simple List<int>or HashSet<int>. I've looked at BitArray, but that can be wasteful depending on how sparse the ids are. I've also considered a bitwise trie, but I'm unsure how to calculate the space efficiency of that solution for the expected data. A Bloom Filter would be great, if only I could tolerate the false negatives. I would appreciate any suggestions of data structures suitable for this purpose. I'm interested in both out-of-the-box and custom solutions. EDIT: To answer your questions: No, the items don't need to be sorted By "pass around" I mean both pass between methods and serialize and send over the wire. I clearly should have mentioned this. There could be a decent number of these sets in memory at once (~100).

    Read the article

  • Display data FROM Sheet1 on Sheet2 based on conditional value

    - by Shawn
    Imagine a worksheet with 30 pieces of information, such as: A1= Start Date A2 = End Date A3 = Resource Name A4 = Cost .... A30 = Whatever B1 = 1/1/2010 B2 = 2/15/2010 B3 = Joe Smith B4 = $10,000.00 ... B30 = Blah Blah Now imagine a third column, C. The purpose of the third column is to determine WHICH report that row of data needs to appear in. C1 = Report 1 C2 = Report 1 and Report 2 C3 = Report 4 and Report 7 C4 = Report 1 and Report 5 ... C30 = Report 2 Each report is on Sheets 2, 3, 4, 5 and so on (depending on how many I decide to create). As you can see form my example above, some data may need to appear in multiple reports. For example, the data in Row 3 (Resource Name: Joe Smith) needs to appear in Report 4 and Report 7. That is to say, it needs to DYNAMICALLY appear on two additional worksheets. If I change the values in column C, then the reports need to update automatically. How can I create the worksheets which will serve as the reports such that they only diaplay the rows which have been "flagged" to be displayed in that report? Thanks!

    Read the article

  • Installation error: INSTALL_FAILED_OLDER_SDK in eclipse

    - by user3014909
    I have an unexpe`ted problem with my Android project. I have a real android device with ice_cream sandwich installed. My app was working fine during the development but after I added a class to the project, I got an error: Installation error: INSTALL_FAILED_OLDER_SDK The problem is that everything is good in the manifest file. The minSdkversion is 8. Here is my manifest file: <?xml version="1.0" encoding="utf-8"?> <manifest xmlns:android="http://schemas.android.com/apk/res/android" package="zabolotnii.pavel.timer" android:versionCode="1" android:versionName="1.0" > <uses-sdk android:minSdkVersion="8" android:targetSdkVersion="18 " /> <application android:allowBackup="true" android:icon="@drawable/ic_launcher" android:label="@string/app_name" android:theme="@style/AppTheme" > <activity android:name="zabolotnii.pavel.timer.TimerActivity" android:label="@string/app_name" > <intent-filter> <action android:name="android.intent.action.MAIN" /> <category android:name="android.intent.category.LAUNCHER" /> </intent-filter> </activity> </application> </manifest> I don't know, if there is any need to attach the new class ,but I didn't any changes to other code that should led to this error: package zabolotnii.pavel.timer; import android.app.AlertDialog; import android.content.Context; import android.content.DialogInterface; import android.graphics.Paint; import android.graphics.Point; import android.graphics.Rect; import android.graphics.drawable.Drawable; import android.os.Environment; import android.util.DisplayMetrics; import android.util.TypedValue; import android.view.*; import android.widget.*; import java.io.File; import java.io.FilenameFilter; import java.util.*; public class OpenFileDialog extends AlertDialog.Builder { private String currentPath = Environment.getExternalStorageDirectory().getPath(); private List<File> files = new ArrayList<File>(); private TextView title; private ListView listView; private FilenameFilter filenameFilter; private int selectedIndex = -1; private OpenDialogListener listener; private Drawable folderIcon; private Drawable fileIcon; private String accessDeniedMessage; public interface OpenDialogListener { public void OnSelectedFile(String fileName); } private class FileAdapter extends ArrayAdapter<File> { public FileAdapter(Context context, List<File> files) { super(context, android.R.layout.simple_list_item_1, files); } @Override public View getView(int position, View convertView, ViewGroup parent) { TextView view = (TextView) super.getView(position, convertView, parent); File file = getItem(position); if (view != null) { view.setText(file.getName()); if (file.isDirectory()) { setDrawable(view, folderIcon); } else { setDrawable(view, fileIcon); if (selectedIndex == position) view.setBackgroundColor(getContext().getResources().getColor(android.R.color.holo_blue_dark)); else view.setBackgroundColor(getContext().getResources().getColor(android.R.color.transparent)); } } return view; } private void setDrawable(TextView view, Drawable drawable) { if (view != null) { if (drawable != null) { drawable.setBounds(0, 0, 60, 60); view.setCompoundDrawables(drawable, null, null, null); } else { view.setCompoundDrawables(null, null, null, null); } } } } public OpenFileDialog(Context context) { super(context); title = createTitle(context); changeTitle(); LinearLayout linearLayout = createMainLayout(context); linearLayout.addView(createBackItem(context)); listView = createListView(context); linearLayout.addView(listView); setCustomTitle(title) .setView(linearLayout) .setPositiveButton(android.R.string.ok, new DialogInterface.OnClickListener() { @Override public void onClick(DialogInterface dialog, int which) { if (selectedIndex > -1 && listener != null) { listener.OnSelectedFile(listView.getItemAtPosition(selectedIndex).toString()); } } }) .setNegativeButton(android.R.string.cancel, null); } @Override public AlertDialog show() { files.addAll(getFiles(currentPath)); listView.setAdapter(new FileAdapter(getContext(), files)); return super.show(); } public OpenFileDialog setFilter(final String filter) { filenameFilter = new FilenameFilter() { @Override public boolean accept(File file, String fileName) { File tempFile = new File(String.format("%s/%s", file.getPath(), fileName)); if (tempFile.isFile()) return tempFile.getName().matches(filter); return true; } }; return this; } public OpenFileDialog setOpenDialogListener(OpenDialogListener listener) { this.listener = listener; return this; } public OpenFileDialog setFolderIcon(Drawable drawable) { this.folderIcon = drawable; return this; } public OpenFileDialog setFileIcon(Drawable drawable) { this.fileIcon = drawable; return this; } public OpenFileDialog setAccessDeniedMessage(String message) { this.accessDeniedMessage = message; return this; } private static Display getDefaultDisplay(Context context) { return ((WindowManager) context.getSystemService(Context.WINDOW_SERVICE)).getDefaultDisplay(); } private static Point getScreenSize(Context context) { Point screeSize = new Point(); getDefaultDisplay(context).getSize(screeSize); return screeSize; } private static int getLinearLayoutMinHeight(Context context) { return getScreenSize(context).y; } private LinearLayout createMainLayout(Context context) { LinearLayout linearLayout = new LinearLayout(context); linearLayout.setOrientation(LinearLayout.VERTICAL); linearLayout.setMinimumHeight(getLinearLayoutMinHeight(context)); return linearLayout; } private int getItemHeight(Context context) { TypedValue value = new TypedValue(); DisplayMetrics metrics = new DisplayMetrics(); context.getTheme().resolveAttribute(android.R.attr.listPreferredItemHeightSmall, value, true); getDefaultDisplay(context).getMetrics(metrics); return (int) TypedValue.complexToDimension(value.data, metrics); } private TextView createTextView(Context context, int style) { TextView textView = new TextView(context); textView.setTextAppearance(context, style); int itemHeight = getItemHeight(context); textView.setLayoutParams(new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT, itemHeight)); textView.setMinHeight(itemHeight); textView.setGravity(Gravity.CENTER_VERTICAL); textView.setPadding(15, 0, 0, 0); return textView; } private TextView createTitle(Context context) { TextView textView = createTextView(context, android.R.style.TextAppearance_DeviceDefault_DialogWindowTitle); return textView; } private TextView createBackItem(Context context) { TextView textView = createTextView(context, android.R.style.TextAppearance_DeviceDefault_Small); Drawable drawable = getContext().getResources().getDrawable(android.R.drawable.ic_menu_directions); drawable.setBounds(0, 0, 60, 60); textView.setCompoundDrawables(drawable, null, null, null); textView.setLayoutParams(new ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT, ViewGroup.LayoutParams.WRAP_CONTENT)); textView.setOnClickListener(new View.OnClickListener() { @Override public void onClick(View view) { File file = new File(currentPath); File parentDirectory = file.getParentFile(); if (parentDirectory != null) { currentPath = parentDirectory.getPath(); RebuildFiles(((FileAdapter) listView.getAdapter())); } } }); return textView; } public int getTextWidth(String text, Paint paint) { Rect bounds = new Rect(); paint.getTextBounds(text, 0, text.length(), bounds); return bounds.left + bounds.width() + 80; } private void changeTitle() { String titleText = currentPath; int screenWidth = getScreenSize(getContext()).x; int maxWidth = (int) (screenWidth * 0.99); if (getTextWidth(titleText, title.getPaint()) > maxWidth) { while (getTextWidth("..." + titleText, title.getPaint()) > maxWidth) { int start = titleText.indexOf("/", 2); if (start > 0) titleText = titleText.substring(start); else titleText = titleText.substring(2); } title.setText("..." + titleText); } else { title.setText(titleText); } } private List<File> getFiles(String directoryPath) { File directory = new File(directoryPath); List<File> fileList = Arrays.asList(directory.listFiles(filenameFilter)); Collections.sort(fileList, new Comparator<File>() { @Override public int compare(File file, File file2) { if (file.isDirectory() && file2.isFile()) return -1; else if (file.isFile() && file2.isDirectory()) return 1; else return file.getPath().compareTo(file2.getPath()); } }); return fileList; } private void RebuildFiles(ArrayAdapter<File> adapter) { try { List<File> fileList = getFiles(currentPath); files.clear(); selectedIndex = -1; files.addAll(fileList); adapter.notifyDataSetChanged(); changeTitle(); } catch (NullPointerException e) { String message = getContext().getResources().getString(android.R.string.unknownName); if (!accessDeniedMessage.equals("")) message = accessDeniedMessage; Toast.makeText(getContext(), message, Toast.LENGTH_SHORT).show(); } } private ListView createListView(Context context) { ListView listView = new ListView(context); listView.setOnItemClickListener(new AdapterView.OnItemClickListener() { @Override public void onItemClick(AdapterView<?> adapterView, View view, int index, long l) { final ArrayAdapter<File> adapter = (FileAdapter) adapterView.getAdapter(); File file = adapter.getItem(index); if (file.isDirectory()) { currentPath = file.getPath(); RebuildFiles(adapter); } else { if (index != selectedIndex) selectedIndex = index; else selectedIndex = -1; adapter.notifyDataSetChanged(); } } }); return listView; } }

    Read the article

  • Python, web log data mining for frequent patterns

    - by descent
    Hello! I need to develop a tool for web log data mining. Having many sequences of urls, requested in a particular user session (retrieved from web-application logs), I need to figure out the patterns of usage and groups (clusters) of users of the website. I am new to Data Mining, and now examining Google a lot. Found some useful info, i.e. querying Frequent Pattern Mining in Web Log Data seems to point to almost exactly similar studies. So my questions are: Are there any python-based tools that do what I need or at least smth similar? Can Orange toolkit be of any help? Can reading the book Programming Collective Intelligence be of any help? What to Google for, what to read, which relatively simple algorithms to use best? I am very limited in time (to around a week), so any help would be extremely precious. What I need is to point me into the right direction and the advice of how to accomplish the task in the shortest time. Thanks in advance!

    Read the article

  • Data Integration Solution?

    - by Shlomo
    At my company we have a number of data feeds and processing that run on any given day. The number of feeds and processing steps is starting to out-number the ability to manage it ad-hoc as it is managed currently. Is there a good solution that helps with logging and managing/scheduling dependencies? For example: A: When file x is FTP dropped into directory D1, kick off processing step B B: Load flat file into DB1 C: When file y is FTP dropped into directory D2, kick off processing Step D D: Load flat file into DB11 E: When B and D are done, churn through the data, and load new data into DB111. F: When Step E is done, launch application process P G: etc... I want those steps to run at the appropriate times, not to mention if B fails, there's no reason to run steps E & F, but I could still run C & D. When I re-run B successfully, it should trigger just E & F to re-run, not C & D. We're a .NET/C#/Sql Server shop, and I'm already familiar with SSIS. Is that really the best there is? That manages steps well, but not external dependencies, or logging. Open source (.NET) preferred, but not required.

    Read the article

  • Save many-to-one relationship from JSON into Core Data

    - by Snow Crash
    I'm wanting to save a Many-to-one relationship parsed from JSON into Core Data. The code that parses the JSON and does the insert into Core Data looks like this: for (NSDictionary *thisRecipe in recipes) { Recipe *recipe = [NSEntityDescription insertNewObjectForEntityForName:@"Recipe" inManagedObjectContext:insertionContext]; recipe.title = [thisRecipe objectForKey:@"Title"]; NSDictionary *ingredientsForRecipe = [thisRecipe objectForKey:@"Ingredients"]; NSArray *ingredientsArray = [ingredientsForRecipe objectForKey:@"Results"]; for (NSDictionary *thisIngredient in ingredientsArray) { Ingredient *ingredient = [NSEntityDescription insertNewObjectForEntityForName:@"Ingredient" inManagedObjectContext:insertionContext]; ingredient.name = [thisIngredient objectForKey:@"Name"]; } } NSSet *ingredientsSet = [NSSet ingredientsArray]; [recipe setIngredients:ingredientsSet]; Notes: "setIngredients" is a Core Data generated accessor method. There is a many-to-one relationship between Ingredients and Recipe However, when I run this I get the following error: NSCFDictionary managedObjectContext]: unrecognized selector sent to instance If I remove the last line (i.e. [recipe setIngredients:ingredientsSet];) then, taking a peek at the SQLite database, I see the Recipe and Ingredients have been stored but no relationship has been created between Recipe and Ingredients Any suggestions as to how to ensure the relationship is stored correctly?

    Read the article

  • Parallelism in .NET – Part 2, Simple Imperative Data Parallelism

    - by Reed
    In my discussion of Decomposition of the problem space, I mentioned that Data Decomposition is often the simplest abstraction to use when trying to parallelize a routine.  If a problem can be decomposed based off the data, we will often want to use what MSDN refers to as Data Parallelism as our strategy for implementing our routine.  The Task Parallel Library in .NET 4 makes implementing Data Parallelism, for most cases, very simple. Data Parallelism is the main technique we use to parallelize a routine which can be decomposed based off dataData Parallelism refers to taking a single collection of data, and having a single operation be performed concurrently on elements in the collection.  One side note here: Data Parallelism is also sometimes referred to as the Loop Parallelism Pattern or Loop-level Parallelism.  In general, for this series, I will try to use the terminology used in the MSDN Documentation for the Task Parallel Library.  This should make it easier to investigate these topics in more detail. Once we’ve determined we have a problem that, potentially, can be decomposed based on data, implementation using Data Parallelism in the TPL is quite simple.  Let’s take our example from the Data Decomposition discussion – a simple contrast stretching filter.  Here, we have a collection of data (pixels), and we need to run a simple operation on each element of the pixel.  Once we know the minimum and maximum values, we most likely would have some simple code like the following: for (int row=0; row < pixelData.GetUpperBound(0); ++row) { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This simple routine loops through a two dimensional array of pixelData, and calls the AdjustContrast routine on each pixel. As I mentioned, when you’re decomposing a problem space, most iteration statements are potentially candidates for data decomposition.  Here, we’re using two for loops – one looping through rows in the image, and a second nested loop iterating through the columns.  We then perform one, independent operation on each element based on those loop positions. This is a prime candidate – we have no shared data, no dependencies on anything but the pixel which we want to change.  Since we’re using a for loop, we can easily parallelize this using the Parallel.For method in the TPL: Parallel.For(0, pixelData.GetUpperBound(0), row => { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } }); Here, by simply changing our first for loop to a call to Parallel.For, we can parallelize this portion of our routine.  Parallel.For works, as do many methods in the TPL, by creating a delegate and using it as an argument to a method.  In this case, our for loop iteration block becomes a delegate creating via a lambda expression.  This lets you write code that, superficially, looks similar to the familiar for loop, but functions quite differently at runtime. We could easily do this to our second for loop as well, but that may not be a good idea.  There is a balance to be struck when writing parallel code.  We want to have enough work items to keep all of our processors busy, but the more we partition our data, the more overhead we introduce.  In this case, we have an image of data – most likely hundreds of pixels in both dimensions.  By just parallelizing our first loop, each row of pixels can be run as a single task.  With hundreds of rows of data, we are providing fine enough granularity to keep all of our processors busy. If we parallelize both loops, we’re potentially creating millions of independent tasks.  This introduces extra overhead with no extra gain, and will actually reduce our overall performance.  This leads to my first guideline when writing parallel code: Partition your problem into enough tasks to keep each processor busy throughout the operation, but not more than necessary to keep each processor busy. Also note that I parallelized the outer loop.  I could have just as easily partitioned the inner loop.  However, partitioning the inner loop would have led to many more discrete work items, each with a smaller amount of work (operate on one pixel instead of one row of pixels).  My second guideline when writing parallel code reflects this: Partition your problem in a way to place the most work possible into each task. This typically means, in practice, that you will want to parallelize the routine at the “highest” point possible in the routine, typically the outermost loop.  If you’re looking at parallelizing methods which call other methods, you’ll want to try to partition your work high up in the stack – as you get into lower level methods, the performance impact of parallelizing your routines may not overcome the overhead introduced. Parallel.For works great for situations where we know the number of elements we’re going to process in advance.  If we’re iterating through an IList<T> or an array, this is a typical approach.  However, there are other iteration statements common in C#.  In many situations, we’ll use foreach instead of a for loop.  This can be more understandable and easier to read, but also has the advantage of working with collections which only implement IEnumerable<T>, where we do not know the number of elements involved in advance. As an example, lets take the following situation.  Say we have a collection of Customers, and we want to iterate through each customer, check some information about the customer, and if a certain case is met, send an email to the customer and update our instance to reflect this change.  Normally, this might look something like: foreach(var customer in customers) { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { theStore.EmailCustomer(customer); customer.LastEmailContact = DateTime.Now; } } Here, we’re doing a fair amount of work for each customer in our collection, but we don’t know how many customers exist.  If we assume that theStore.GetLastContact(customer) and theStore.EmailCustomer(customer) are both side-effect free, thread safe operations, we could parallelize this using Parallel.ForEach: Parallel.ForEach(customers, customer => { // Run some process that takes some time... DateTime lastContact = theStore.GetLastContact(customer); TimeSpan timeSinceContact = DateTime.Now - lastContact; // If it's been more than two weeks, send an email, and update... if (timeSinceContact.Days > 14) { theStore.EmailCustomer(customer); customer.LastEmailContact = DateTime.Now; } }); Just like Parallel.For, we rework our loop into a method call accepting a delegate created via a lambda expression.  This keeps our new code very similar to our original iteration statement, however, this will now execute in parallel.  The same guidelines apply with Parallel.ForEach as with Parallel.For. The other iteration statements, do and while, do not have direct equivalents in the Task Parallel Library.  These, however, are very easy to implement using Parallel.ForEach and the yield keyword. Most applications can benefit from implementing some form of Data Parallelism.  Iterating through collections and performing “work” is a very common pattern in nearly every application.  When the problem can be decomposed by data, we often can parallelize the workload by merely changing foreach statements to Parallel.ForEach method calls, and for loops to Parallel.For method calls.  Any time your program operates on a collection, and does a set of work on each item in the collection where that work is not dependent on other information, you very likely have an opportunity to parallelize your routine.

    Read the article

  • ERROR: Attempted to read or write protected memory. This is often an indication that other memory is corrupt

    - by SPSamL
    I get this error after having edited a few pages in SharePoint 2010. I have to do an IISReset on both front ends to get this to resolve. I don't know how to fix it or even what else to supply here, but please let me know as the resets now happen several times per day. Log Name: Application Source: ASP.NET 2.0.50727.0 Date: 1/26/2011 11:12:48 AM Event ID: 1309 Task Category: Web Event Level: Warning Keywords: Classic User: N/A Computer: PINTSPSFE02.samcstl.org Description: Event code: 3005 Event message: An unhandled exception has occurred. Event time: 1/26/2011 11:12:48 AM Event time (UTC): 1/26/2011 5:12:48 PM Event ID: c52fb336b7f147a3913fff3617a99d57 Event sequence: 4965 Event occurrence: 2178 Event detail code: 0 Application information: Application domain: /LM/W3SVC/1449762715/ROOT-2-129405348166941887 Trust level: WSS_Minimal Application Virtual Path: / Application Path: C:\inetpub\wwwroot\wss\VirtualDirectories\80\ Machine name: PINTSPSFE02 Process information: Process ID: 5928 Process name: w3wp.exe Account name: SAMC\MossAppPool Exception information: Exception type: AccessViolationException Exception message: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. Request information: Request URL: http://mosscluster/Pages/Home.aspx Request path: /Pages/Home.aspx User host address: 10.3.60.26 User: SAMC\BARNMD Is authenticated: True Authentication Type: NTLM Thread account name: SAMC\MossAppPool Thread information: Thread ID: 110 Thread account name: SAMC\MossAppPool Is impersonating: False Stack trace: at Microsoft.Office.Server.ObjectCache.SPCache.MossObjectCache_Tracked.Delete(String key, Boolean recursive, DeletionReason reason) at Microsoft.Office.Server.ObjectCache.SPCache.MossObjectCache_Tracked.Get(String key) at Microsoft.Office.Server.ObjectCache.SPCache.Get(String objectTypeName, String id) at Microsoft.Office.Server.Administration.UserProfileServiceProxy.GetPartitionPropertiesCache(Guid applicationID) at Microsoft.Office.Server.Administration.UserProfileApplicationProxy.get_PartitionPropertiesCache() at Microsoft.Office.Server.Administration.UserProfileApplicationProxy.DataCache.get_PartitionProperties() at Microsoft.Office.Server.Administration.UserProfileApplicationProxy.GetMySitePortalUrl(SPUrlZone zone, Guid partitionID) at Microsoft.Office.Server.Administration.UserProfileApplicationProxy.GetMySitePortalUrl(SPUrlZone zone, SPServiceContext serviceContext) at Microsoft.Office.Server.WebControls.MyLinksRibbon.EnsureMySiteUrls() at Microsoft.Office.Server.WebControls.MyLinksRibbon.get_PortalMySiteUrlAvailable() at Microsoft.Office.Server.WebControls.MyLinksRibbon.OnLoad(EventArgs e) at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) Custom event details: Event Xml: <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event"> <System> <Provider Name="ASP.NET 2.0.50727.0" /> <EventID Qualifiers="32768">1309</EventID> <Level>3</Level> <Task>3</Task> <Keywords>0x80000000000000</Keywords> <TimeCreated SystemTime="2011-01-26T17:12:48.000000000Z" /> <EventRecordID>35834</EventRecordID> <Channel>Application</Channel> <Computer>PINTSPSFE02.samcstl.org</Computer> <Security /> </System> <EventData> <Data>3005</Data> <Data>An unhandled exception has occurred.</Data> <Data>1/26/2011 11:12:48 AM</Data> <Data>1/26/2011 5:12:48 PM</Data> <Data>c52fb336b7f147a3913fff3617a99d57</Data> <Data>4965</Data> <Data>2178</Data> <Data>0</Data> <Data>/LM/W3SVC/1449762715/ROOT-2-129405348166941887</Data> <Data>WSS_Minimal</Data> <Data>/</Data> <Data>C:\inetpub\wwwroot\wss\VirtualDirectories\80\</Data> <Data>PINTSPSFE02</Data> <Data> </Data> <Data>5928</Data> <Data>w3wp.exe</Data> <Data>SAMC\MossAppPool</Data> <Data>AccessViolationException</Data> <Data></Data> <Data>http://mosscluster/Pages/Home.aspx</Data> <Data>/Pages/Home.aspx</Data> <Data>10.3.60.26</Data> <Data>SAMC\BARNMD</Data> <Data>True</Data> <Data>NTLM</Data> <Data>SAMC\MossAppPool</Data> <Data>110</Data> <Data>SAMC\MossAppPool</Data> <Data>False</Data> <Data> at Microsoft.Office.Server.ObjectCache.SPCache.MossObjectCache_Tracked.Delete(String key, Boolean recursive, DeletionReason reason) at Microsoft.Office.Server.ObjectCache.SPCache.MossObjectCache_Tracked.Get(String key) at Microsoft.Office.Server.ObjectCache.SPCache.Get(String objectTypeName, String id) at Microsoft.Office.Server.Administration.UserProfileServiceProxy.GetPartitionPropertiesCache(Guid applicationID) at Microsoft.Office.Server.Administration.UserProfileApplicationProxy.get_PartitionPropertiesCache() at Microsoft.Office.Server.Administration.UserProfileApplicationProxy.DataCache.get_PartitionProperties() at Microsoft.Office.Server.Administration.UserProfileApplicationProxy.GetMySitePortalUrl(SPUrlZone zone, Guid partitionID) at Microsoft.Office.Server.Administration.UserProfileApplicationProxy.GetMySitePortalUrl(SPUrlZone zone, SPServiceContext serviceContext) at Microsoft.Office.Server.WebControls.MyLinksRibbon.EnsureMySiteUrls() at Microsoft.Office.Server.WebControls.MyLinksRibbon.get_PortalMySiteUrlAvailable() at Microsoft.Office.Server.WebControls.MyLinksRibbon.OnLoad(EventArgs e) at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Control.LoadRecursive() at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) </Data> </EventData> </Event>

    Read the article

  • Maintaining shared service in ASP.NET MVC Application

    - by kazimanzurrashid
    Depending on the application sometimes we have to maintain some shared service throughout our application. Let’s say you are developing a multi-blog supported blog engine where both the controller and view must know the currently visiting blog, it’s setting , user information and url generation service. In this post, I will show you how you can handle this kind of case in most convenient way. First, let see the most basic way, we can create our PostController in the following way: public class PostController : Controller { public PostController(dependencies...) { } public ActionResult Index(string blogName, int? page) { BlogInfo blog = blogSerivce.FindByName(blogName); if (blog == null) { return new NotFoundResult(); } IEnumerable<PostInfo> posts = postService.FindPublished(blog.Id, PagingCalculator.StartIndex(page, blog.PostPerPage), blog.PostPerPage); int count = postService.GetPublishedCount(blog.Id); UserInfo user = null; if (HttpContext.User.Identity.IsAuthenticated) { user = userService.FindByName(HttpContext.User.Identity.Name); } return View(new IndexViewModel(urlResolver, user, blog, posts, count, page)); } public ActionResult Archive(string blogName, int? page, ArchiveDate archiveDate) { BlogInfo blog = blogSerivce.FindByName(blogName); if (blog == null) { return new NotFoundResult(); } IEnumerable<PostInfo> posts = postService.FindArchived(blog.Id, archiveDate, PagingCalculator.StartIndex(page, blog.PostPerPage), blog.PostPerPage); int count = postService.GetArchivedCount(blog.Id, archiveDate); UserInfo user = null; if (HttpContext.User.Identity.IsAuthenticated) { user = userService.FindByName(HttpContext.User.Identity.Name); } return View(new ArchiveViewModel(urlResolver, user, blog, posts, count, page, achiveDate)); } public ActionResult Tag(string blogName, string tagSlug, int? page) { BlogInfo blog = blogSerivce.FindByName(blogName); if (blog == null) { return new NotFoundResult(); } TagInfo tag = tagService.FindBySlug(blog.Id, tagSlug); if (tag == null) { return new NotFoundResult(); } IEnumerable<PostInfo> posts = postService.FindPublishedByTag(blog.Id, tag.Id, PagingCalculator.StartIndex(page, blog.PostPerPage), blog.PostPerPage); int count = postService.GetPublishedCountByTag(tag.Id); UserInfo user = null; if (HttpContext.User.Identity.IsAuthenticated) { user = userService.FindByName(HttpContext.User.Identity.Name); } return View(new TagViewModel(urlResolver, user, blog, posts, count, page, tag)); } } As you can see the above code heavily depends upon the current blog and the blog retrieval code is duplicated in all of the action methods, once the blog is retrieved the same blog is passed in the view model. Other than the blog the view also needs the current user and url resolver to render it properly. One way to remove the duplicate blog retrieval code is to create a custom model binder which converts the blog from a blog name and use the blog a parameter in the action methods instead of the string blog name, but it only helps the first half in the above scenario, the action methods still have to pass the blog, user and url resolver etc in the view model. Now lets try to improve the the above code, first lets create a new class which would contain the shared services, lets name it as BlogContext: public class BlogContext { public BlogInfo Blog { get; set; } public UserInfo User { get; set; } public IUrlResolver UrlResolver { get; set; } } Next, we will create an interface, IContextAwareService: public interface IContextAwareService { BlogContext Context { get; set; } } The idea is, whoever needs these shared services needs to implement this interface, in our case both the controller and the view model, now we will create an action filter which will be responsible for populating the context: public class PopulateBlogContextAttribute : FilterAttribute, IActionFilter { private static string blogNameRouteParameter = "blogName"; private readonly IBlogService blogService; private readonly IUserService userService; private readonly BlogContext context; public PopulateBlogContextAttribute(IBlogService blogService, IUserService userService, IUrlResolver urlResolver) { Invariant.IsNotNull(blogService, "blogService"); Invariant.IsNotNull(userService, "userService"); Invariant.IsNotNull(urlResolver, "urlResolver"); this.blogService = blogService; this.userService = userService; context = new BlogContext { UrlResolver = urlResolver }; } public static string BlogNameRouteParameter { [DebuggerStepThrough] get { return blogNameRouteParameter; } [DebuggerStepThrough] set { blogNameRouteParameter = value; } } public void OnActionExecuting(ActionExecutingContext filterContext) { string blogName = (string) filterContext.Controller.ValueProvider.GetValue(BlogNameRouteParameter).ConvertTo(typeof(string), Culture.Current); if (!string.IsNullOrWhiteSpace(blogName)) { context.Blog = blogService.FindByName(blogName); } if (context.Blog == null) { filterContext.Result = new NotFoundResult(); return; } if (filterContext.HttpContext.User.Identity.IsAuthenticated) { context.User = userService.FindByName(filterContext.HttpContext.User.Identity.Name); } IContextAwareService controller = filterContext.Controller as IContextAwareService; if (controller != null) { controller.Context = context; } } public void OnActionExecuted(ActionExecutedContext filterContext) { Invariant.IsNotNull(filterContext, "filterContext"); if ((filterContext.Exception == null) || filterContext.ExceptionHandled) { IContextAwareService model = filterContext.Controller.ViewData.Model as IContextAwareService; if (model != null) { model.Context = context; } } } } As you can see we are populating the context in the OnActionExecuting, which executes just before the controllers action methods executes, so by the time our action methods executes the context is already populated, next we are are assigning the same context in the view model in OnActionExecuted method which executes just after we set the  model and return the view in our action methods. Now, lets change the view models so that it implements this interface: public class IndexViewModel : IContextAwareService { // More Codes } public class ArchiveViewModel : IContextAwareService { // More Codes } public class TagViewModel : IContextAwareService { // More Codes } and the controller: public class PostController : Controller, IContextAwareService { public PostController(dependencies...) { } public BlogContext Context { get; set; } public ActionResult Index(int? page) { IEnumerable<PostInfo> posts = postService.FindPublished(Context.Blog.Id, PagingCalculator.StartIndex(page, Context.Blog.PostPerPage), Context.Blog.PostPerPage); int count = postService.GetPublishedCount(Context.Blog.Id); return View(new IndexViewModel(posts, count, page)); } public ActionResult Archive(int? page, ArchiveDate archiveDate) { IEnumerable<PostInfo> posts = postService.FindArchived(Context.Blog.Id, archiveDate, PagingCalculator.StartIndex(page, Context.Blog.PostPerPage), Context.Blog.PostPerPage); int count = postService.GetArchivedCount(Context.Blog.Id, archiveDate); return View(new ArchiveViewModel(posts, count, page, achiveDate)); } public ActionResult Tag(string blogName, string tagSlug, int? page) { TagInfo tag = tagService.FindBySlug(Context.Blog.Id, tagSlug); if (tag == null) { return new NotFoundResult(); } IEnumerable<PostInfo> posts = postService.FindPublishedByTag(Context.Blog.Id, tag.Id, PagingCalculator.StartIndex(page, Context.Blog.PostPerPage), Context.Blog.PostPerPage); int count = postService.GetPublishedCountByTag(tag.Id); return View(new TagViewModel(posts, count, page, tag)); } } Now, the last thing where we have to glue everything, I will be using the AspNetMvcExtensibility to register the action filter (as there is no better way to inject the dependencies in action filters). public class RegisterFilters : RegisterFiltersBase { private static readonly Type controllerType = typeof(Controller); private static readonly Type contextAwareType = typeof(IContextAwareService); protected override void Register(IFilterRegistry registry) { TypeCatalog controllers = new TypeCatalogBuilder() .Add(GetType().Assembly) .Include(type => controllerType.IsAssignableFrom(type) && contextAwareType.IsAssignableFrom(type)); registry.Register<PopulateBlogContextAttribute>(controllers); } } Thoughts and Comments?

    Read the article

  • Data Profiling without SSIS

    Strangely enough for a predominantly SSIS blog, this post is all about how to perform data profiling without using SSIS. Whilst the Data Profiling Task is a worthy addition, there are a couple of limitations I’ve encountered of late. The first is that it requires SQL Server 2008, and not everyone is there yet. The second is that it can only target SQL Server 2005 and above. What about older systems, which are the ones that we probably need to investigate the most, or other vendor databases such as Oracle? With these limitations in mind I did some searching to find a quick and easy alternative to help me perform some data profiling for a project I was working on recently. I only had SQL Server 2005 available, and anyway most of my target source systems were Oracle, and of course I had short timescales. I looked at several options. Some never got beyond the download stage, they failed to install or just did not run, and others provided less than I could have produced myself by spending 2 minutes writing some basic SQL queries. In the end I settled on an open source product called DataCleaner. To quote from their website: DataCleaner is an Open Source application for profiling, validating and comparing data. These activities help you administer and monitor your data quality in order to ensure that your data is useful and applicable to your business situation. DataCleaner is the free alternative to software for master data management (MDM) methodologies, data warehousing (DW) projects, statistical research, preparation for extract-transform-load (ETL) activities and more. DataCleaner is developed in Java and licensed under LGPL. As quoted above it claims to support profiling, validating and comparing data, but I didn’t really get past the profiling functions, so won’t comment on the other two. The profiling whilst not prefect certainly saved some time compared to the limited alternatives. The ability to profile heterogeneous data sources is a big advantage over the SSIS option, and I found it overall quite easy to use and performance was good. I could see it struggling at times, but actually for what it does I was impressed. It had some data type niggles with Oracle, and some metrics seem a little strange, although thankfully they were easy to augment with some SQL queries to ensure a consistent picture. The report export options didn’t do it for me, but copy and paste with a bit of Excel magic was sufficient. One initial point for me personally is that I have had limited exposure to things of the Java persuasion and whilst I normally get by fine, sometimes the simplest things can throw me. For example installing a JDBC driver, why do I have to copy files to make it all work, has nobody ever heard of an MSI? In case there are other people out there like me who have become totally indoctrinated with the Microsoft software paradigm, I’ve written a quick start guide that details every step required. Steps 1- 5 are the key ones, the rest is really an excuse for some screenshots to show you the tool. Quick Start Guide Step 1  - Download Data Cleaner. The Microsoft Windows zipped exe option, and I chose the latest stable build, currently DataCleaner 1.5.3 (final). Extract the files to a suitable location. Step 2 - Download Java. If you try and run datacleaner.exe without Java it will warn you, and then open your default browser and take you to the Java download site. Follow the installation instructions from there, normally just click Download Java a couple of times and you’re done. Step 3 - Download Microsoft SQL Server JDBC Driver. You may have SQL Server installed, but you won’t have a JDBC driver. Version 3.0 is the latest as of April 2010. There is no real installer, we are in the Java world here, but run the exe you downloaded to extract the files. The default Unzip to folder is not much help, so try a fully qualified path such as C:\Program Files\Microsoft SQL Server JDBC Driver 3.0\ to ensure you can find the files afterwards. Step 4 - If you wish to use Windows Authentication to connect to your SQL Server then first we need to copy a file so that Data Cleaner can find it. Browse to the JDBC extract location from Step 3 and drill down to the file sqljdbc_auth.dll. You will have to choose the correct directory for your processor architecture. e.g. C:\Program Files\Microsoft SQL Server JDBC Driver 3.0\sqljdbc_3.0\enu\auth\x86\sqljdbc_auth.dll. Now copy this file to the Data Cleaner extract folder you chose in Step 1. An alternative method is to edit datacleaner.cmd in the data cleaner extract folder as detailed in this data cleaner wiki topic, but I find copying the file simpler. Step 5 – Now lets run Data Cleaner, just run datacleaner.exe from the extract folder you chose in Step 1. Step 6 – Complete or skip the registration screen, and ignore the task window for now. In the main window click settings. Step 7 – In the Settings dialog, select the Database drivers tab, then click Register database driver and select the Local JAR file option. Step 8 – Browse to the JDBC driver extract location from Step 3 and drill down to select sqljdbc4.jar. e.g. C:\Program Files\Microsoft SQL Server JDBC Driver 3.0\sqljdbc_3.0\enu\sqljdbc4.jar Step 9 – Select the Database driver class as com.microsoft.sqlserver.jdbc.SQLServerDriver, and then click the Test and Save database driver button. Step 10 - You should be back at the Settings dialog with a the list of drivers that includes SQL Server. Just click Save Settings to persist all your hard work. Step 11 – Now we can start to profile some data. In the main Data Cleaner window click New Task, and then Profile from the task window. Step 12 – In the Profile window click Open Database Step 13 – Now choose the SQL Server connection string option. Selecting a connection string gives us a template like jdbc:sqlserver://<hostname>:1433;databaseName=<database>, but obviously it requires some details to be entered for example  jdbc:sqlserver://localhost:1433;databaseName=SQLBits. This will connect to the database called SQLBits on my local machine. The port may also have to be changed if using such as when you have a multiple instances of SQL Server running. If using SQL Server Authentication enter a username and password as required and then click Connect to database. You can use Window Authentication, just add integratedSecurity=true to the end of your connection string. e.g jdbc:sqlserver://localhost:1433;databaseName=SQLBits;integratedSecurity=true.  If you didn’t complete Step 4 above you will need to do so now and restart Data Cleaner before it will work. Manually setting the connection string is fine, but creating a named connection makes more sense if you will be spending any length of time profiling a specific database. As highlighted in the left-hand screen-shot, at the bottom of the dialog it includes partial instructions on how to create named connections. In the folder shown C:\Users\<Username>\.datacleaner\1.5.3, open the datacleaner-config.xml file in your editor of choice add your own details. You’ll see a sample connection in the file already, just add yours following the same pattern. e.g. <!-- Darren's Named Connections --> <bean class="dk.eobjects.datacleaner.gui.model.NamedConnection"> <property name="name" value="SQLBits Local Connection" /> <property name="driverClass" value="com.microsoft.sqlserver.jdbc.SQLServerDriver" /> <property name="connectionString" value="jdbc:sqlserver://localhost:1433;databaseName=SQLBits;integratedSecurity=true" /> <property name="tableTypes"> <list> <value>TABLE</value> <value>VIEW</value> </list> </property> </bean> Step 14 – Once back at the Profile window, you should now see your schemas, tables and/or views listed down the left hand side. Browse this tree and double-click a table to select it for profiling. You can then click Add profile, and choose some profiling options, before finally clicking Run profiling. You can see below a sample output for three of the most common profiles, click the image for full size.   I hope this has given you a taster for DataCleaner, and should help you get up and running pretty quickly.

    Read the article

  • Oracle Data Integration Solutions and the Oracle EXADATA Database Machine

    - by João Vilanova
    Oracle's data integration solutions provide a complete, open and integrated solution for building, deploying, and managing real-time data-centric architectures in operational and analytical environments. Fully integrated with and optimized for the Oracle Exadata Database Machine, Oracle's data integration solutions take data integration to the next level and delivers extremeperformance and scalability for all the enterprise data movement and transformation needs. Easy-to-use, open and standards-based Oracle's data integration solutions dramatically improve productivity, provide unparalleled efficiency, and lower the cost of ownership.You can watch a video about this subject, after clicking on the link below.DIS for EXADATA Video

    Read the article

  • BIWA Wednesday TechCast Series - Opposition to Data Warehouse Initiatives

    - by jenny.gelhausen
    BIWA Wednesday TechCast Series - 19th Event! Opposition to Data Warehouse Initiatives Please join us for this webcast on Wednesday, March 24, 12 noon Eastern or check your local area's time Webcast is open to clients, prospects and partners. No matter how good your technology and technical skills, organizational issues can derail a data warehousing or BI project. Therefore BIWA presents a vital topic that crosses product boundaries: organizational resistance to data warehouse initiatives - how to recognize it and what to do about it. Many a DW/BI professional has been surprised by organizational resistance to DW/BI initiatives. Yet real organizational imperatives may be behind this apparently irrational behavior. Based on in-depth interviews with IT professionals, industry consultants, and power users, our speaker Bruce Jenks will present his research findings about what drives organizational resistance to data warehouse initiatives. The talk will cover specific behaviors that can signal organizational resistance to a data warehouse program and what organizations have done to address such resistance. Presenter: Bruce Jenks of Dun and Bradstreet Bruce Jenks has over 20 years experience in data warehousing and business intelligence, much of it as a consultant to large organizations spanning the US. Bruce's data warehousing clients have included firms such as Sprint, Gallo Wines, Southern California Edison, The Gap, and Safeway. He started his data warehousing career at Metaphor Computers, a pioneering DW/BI firm from which a number of industry luminaries sprang including Ralph Kimball (author of The Data Warehouse Toolkit ). Bruce continued his data warehousing career at HP, Stanford University and other firms. Bruce is currently completing his doctorate in business administration at Golden Gate University, and today's material arises from his doctoral research. He is also a principal consultant for Dun and Bradstreet. Audio Dial-In: 866 682 4770 Audio Meeting ID: 1683901 Audio Meeting Passcode: 334451 Web Conference: Please register at https://www1.gotomeeting.com/register/807185273 After you register you will be provided with a link to the TechCast. Invitation to Speakers: All BIWA members and Oracle professionals (experts, end users, managers, DBAs, developers, data analysts, ISVs, partners, etc.) may submit abstracts for 45 minute technical webcasts to our Oracle BIWA (IOUG SIG) Community. Submit your BIWA TechCast abstract today! BIWA is a worldwide forum with over 2000 members who are business intelligence, warehousing and analytics professionals. BIWA presents information, experiences and best practices in successfully deploying Oracle Database-centric BI, Data Warehousing, and Analytics products, features and Options--the Oracle Database "BIWA" platform. Attendance Information & Replays at the BIWA website: oraclebiwa.org var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); try { var pageTracker = _gat._getTracker("UA-13185312-1"); pageTracker._trackPageview(); } catch(err) {}

    Read the article

  • SQL Monitor’s data repository

    - by Chris Lambrou
    As one of the developers of SQL Monitor, I often get requests passed on by our support people from customers who are looking to dip into SQL Monitor’s own data repository, in order to pull out bits of information that they’re interested in. Since there’s clearly interest out there in playing around directly with the data repository, I thought I’d write some blog posts to start to describe how it all works. The hardest part for me is knowing where to begin, since the schema of the data repository is pretty big. Hmmm… I guess it’s tricky for anyone to write anything but the most trivial of queries against the data repository without understanding the hierarchy of monitored objects, so perhaps my first post should start there. I always imagine that whenever a customer fires up SSMS and starts to explore their SQL Monitor data repository database, they become immediately bewildered by the schema – that was certainly my experience when I did so for the first time. The following query shows the number of different object types in the data repository schema: SELECT type_desc, COUNT(*) AS [count] FROM sys.objects GROUP BY type_desc ORDER BY type_desc;  type_desccount 1DEFAULT_CONSTRAINT63 2FOREIGN_KEY_CONSTRAINT181 3INTERNAL_TABLE3 4PRIMARY_KEY_CONSTRAINT190 5SERVICE_QUEUE3 6SQL_INLINE_TABLE_VALUED_FUNCTION381 7SQL_SCALAR_FUNCTION2 8SQL_STORED_PROCEDURE100 9SYSTEM_TABLE41 10UNIQUE_CONSTRAINT54 11USER_TABLE193 12VIEW124 With 193 tables, 124 views, 100 stored procedures and 381 table valued functions, that’s quite a hefty schema, and when you browse through it using SSMS, it can be a bit daunting at first. So, where to begin? Well, let’s narrow things down a bit and only look at the tables belonging to the data schema. That’s where all of the collected monitoring data is stored by SQL Monitor. The following query gives us the names of those tables: SELECT sch.name + '.' + obj.name AS [name] FROM sys.objects obj JOIN sys.schemas sch ON sch.schema_id = obj.schema_id WHERE obj.type_desc = 'USER_TABLE' AND sch.name = 'data' ORDER BY sch.name, obj.name; This query still returns 110 tables. I won’t show them all here, but let’s have a look at the first few of them:  name 1data.Cluster_Keys 2data.Cluster_Machine_ClockSkew_UnstableSamples 3data.Cluster_Machine_Cluster_StableSamples 4data.Cluster_Machine_Keys 5data.Cluster_Machine_LogicalDisk_Capacity_StableSamples 6data.Cluster_Machine_LogicalDisk_Keys 7data.Cluster_Machine_LogicalDisk_Sightings 8data.Cluster_Machine_LogicalDisk_UnstableSamples 9data.Cluster_Machine_LogicalDisk_Volume_StableSamples 10data.Cluster_Machine_Memory_Capacity_StableSamples 11data.Cluster_Machine_Memory_UnstableSamples 12data.Cluster_Machine_Network_Capacity_StableSamples 13data.Cluster_Machine_Network_Keys 14data.Cluster_Machine_Network_Sightings 15data.Cluster_Machine_Network_UnstableSamples 16data.Cluster_Machine_OperatingSystem_StableSamples 17data.Cluster_Machine_Ping_UnstableSamples 18data.Cluster_Machine_Process_Instances 19data.Cluster_Machine_Process_Keys 20data.Cluster_Machine_Process_Owner_Instances 21data.Cluster_Machine_Process_Sightings 22data.Cluster_Machine_Process_UnstableSamples 23… There are two things I want to draw your attention to: The table names describe a hierarchy of the different types of object that are monitored by SQL Monitor (e.g. clusters, machines and disks). For each object type in the hierarchy, there are multiple tables, ending in the suffixes _Keys, _Sightings, _StableSamples and _UnstableSamples. Not every object type has a table for every suffix, but the _Keys suffix is especially important and a _Keys table does indeed exist for every object type. In fact, if we limit the query to return only those tables ending in _Keys, we reveal the full object hierarchy: SELECT sch.name + '.' + obj.name AS [name] FROM sys.objects obj JOIN sys.schemas sch ON sch.schema_id = obj.schema_id WHERE obj.type_desc = 'USER_TABLE' AND sch.name = 'data' AND obj.name LIKE '%_Keys' ORDER BY sch.name, obj.name;  name 1data.Cluster_Keys 2data.Cluster_Machine_Keys 3data.Cluster_Machine_LogicalDisk_Keys 4data.Cluster_Machine_Network_Keys 5data.Cluster_Machine_Process_Keys 6data.Cluster_Machine_Services_Keys 7data.Cluster_ResourceGroup_Keys 8data.Cluster_ResourceGroup_Resource_Keys 9data.Cluster_SqlServer_Agent_Job_History_Keys 10data.Cluster_SqlServer_Agent_Job_Keys 11data.Cluster_SqlServer_Database_BackupType_Backup_Keys 12data.Cluster_SqlServer_Database_BackupType_Keys 13data.Cluster_SqlServer_Database_CustomMetric_Keys 14data.Cluster_SqlServer_Database_File_Keys 15data.Cluster_SqlServer_Database_Keys 16data.Cluster_SqlServer_Database_Table_Index_Keys 17data.Cluster_SqlServer_Database_Table_Keys 18data.Cluster_SqlServer_Error_Keys 19data.Cluster_SqlServer_Keys 20data.Cluster_SqlServer_Services_Keys 21data.Cluster_SqlServer_SqlProcess_Keys 22data.Cluster_SqlServer_TopQueries_Keys 23data.Cluster_SqlServer_Trace_Keys 24data.Group_Keys The full object type hierarchy looks like this: Cluster Machine LogicalDisk Network Process Services ResourceGroup Resource SqlServer Agent Job History Database BackupType Backup CustomMetric File Table Index Error Services SqlProcess TopQueries Trace Group Okay, but what about the individual objects themselves represented at each level in this hierarchy? Well that’s what the _Keys tables are for. This is probably best illustrated by way of a simple example – how can I query my own data repository to find the databases on my own PC for which monitoring data has been collected? Like this: SELECT clstr._Name AS cluster_name, srvr._Name AS instance_name, db._Name AS database_name FROM data.Cluster_SqlServer_Database_Keys db JOIN data.Cluster_SqlServer_Keys srvr ON db.ParentId = srvr.Id -- Note here how the parent of a Database is a Server JOIN data.Cluster_Keys clstr ON srvr.ParentId = clstr.Id -- Note here how the parent of a Server is a Cluster WHERE clstr._Name = 'dev-chrisl2' -- This is the hostname of my own PC ORDER BY clstr._Name, srvr._Name, db._Name;  cluster_nameinstance_namedatabase_name 1dev-chrisl2SqlMonitorData 2dev-chrisl2master 3dev-chrisl2model 4dev-chrisl2msdb 5dev-chrisl2mssqlsystemresource 6dev-chrisl2tempdb 7dev-chrisl2sql2005SqlMonitorData 8dev-chrisl2sql2005TestDatabase 9dev-chrisl2sql2005master 10dev-chrisl2sql2005model 11dev-chrisl2sql2005msdb 12dev-chrisl2sql2005mssqlsystemresource 13dev-chrisl2sql2005tempdb 14dev-chrisl2sql2008SqlMonitorData 15dev-chrisl2sql2008master 16dev-chrisl2sql2008model 17dev-chrisl2sql2008msdb 18dev-chrisl2sql2008mssqlsystemresource 19dev-chrisl2sql2008tempdb These results show that I have three SQL Server instances on my machine (a default instance, one named sql2005 and one named sql2008), and each instance has the usual set of system databases, along with a database named SqlMonitorData. Basically, this is where I test SQL Monitor on different versions of SQL Server, when I’m developing. There are a few important things we can learn from this query: Each _Keys table has a column named Id. This is the primary key. Each _Keys table has a column named ParentId. A foreign key relationship is defined between each _Keys table and its parent _Keys table in the hierarchy. There are two exceptions to this, Cluster_Keys and Group_Keys, because clusters and groups live at the root level of the object hierarchy. Each _Keys table has a column named _Name. This is used to uniquely identify objects in the table within the scope of the same shared parent object. Actually, that last item isn’t always true. In some cases, the _Name column is actually called something else. For example, the data.Cluster_Machine_Services_Keys table has a column named _ServiceName instead of _Name (sorry for the inconsistency). In other cases, a name isn’t sufficient to uniquely identify an object. For example, right now my PC has multiple processes running, all sharing the same name, Chrome (one for each tab open in my web-browser). In such cases, multiple columns are used to uniquely identify an object within the scope of the same shared parent object. Well, that’s it for now. I’ve given you enough information for you to explore the _Keys tables to see how objects are stored in your own data repositories. In a future post, I’ll try to explain how monitoring data is stored for each object, using the _StableSamples and _UnstableSamples tables. If you have any questions about this post, or suggestions for future posts, just submit them in the comments section below.

    Read the article

< Previous Page | 51 52 53 54 55 56 57 58 59 60 61 62  | Next Page >