subset - Page 36 - Developer IT

Introduction to the ASP.NET Web API

- by Stephen.Walther

I am a huge fan of Ajax. If you want to create a great experience for the users of your website – regardless of whether you are building an ASP.NET MVC or an ASP.NET Web Forms site — then you need to use Ajax. Otherwise, you are just being cruel to your customers. We use Ajax extensively in several of the ASP.NET applications that my company, Superexpert.com, builds. We expose data from the server as JSON and use jQuery to retrieve and update that data from the browser. One challenge, when building an ASP.NET website, is deciding on which technology to use to expose JSON data from the server. For example, how do you expose a list of products from the server as JSON so you can retrieve the list of products with jQuery? You have a number of options (too many options) including ASMX Web services, WCF Web Services, ASHX Generic Handlers, WCF Data Services, and MVC controller actions. Fortunately, the world has just been simplified. With the release of ASP.NET 4 Beta, Microsoft has introduced a new technology for exposing JSON from the server named the ASP.NET Web API. You can use the ASP.NET Web API with both ASP.NET MVC and ASP.NET Web Forms applications. The goal of this blog post is to provide you with a brief overview of the features of the new ASP.NET Web API. You learn how to use the ASP.NET Web API to retrieve, insert, update, and delete database records with jQuery. We also discuss how you can perform form validation when using the Web API and use OData when using the Web API. Creating an ASP.NET Web API Controller The ASP.NET Web API exposes JSON data through a new type of controller called an API controller. You can add an API controller to an existing ASP.NET MVC 4 project through the standard Add Controller dialog box. Right-click your Controllers folder and select Add, Controller. In the dialog box, name your controller MovieController and select the Empty API controller template: A brand new API controller looks like this: using System; using System.Collections.Generic; using System.Linq; using System.Net.Http; using System.Web.Http; namespace MyWebAPIApp.Controllers { public class MovieController : ApiController { } } An API controller, unlike a standard MVC controller, derives from the base ApiController class instead of the base Controller class. Using jQuery to Retrieve, Insert, Update, and Delete Data Let’s create an Ajaxified Movie Database application. We’ll retrieve, insert, update, and delete movies using jQuery with the MovieController which we just created. Our Movie model class looks like this: namespace MyWebAPIApp.Models { public class Movie { public int Id { get; set; } public string Title { get; set; } public string Director { get; set; } } } Our application will consist of a single HTML page named Movies.html. We’ll place all of our jQuery code in the Movies.html page. Getting a Single Record with the ASP.NET Web API To support retrieving a single movie from the server, we need to add a Get method to our API controller: using System; using System.Collections.Generic; using System.Linq; using System.Net; using System.Net.Http; using System.Web.Http; using MyWebAPIApp.Models; namespace MyWebAPIApp.Controllers { public class MovieController : ApiController { public Movie GetMovie(int id) { // Return movie by id if (id == 1) { return new Movie { Id = 1, Title = "Star Wars", Director = "Lucas" }; } // Otherwise, movie was not found throw new HttpResponseException(HttpStatusCode.NotFound); } } } In the code above, the GetMovie() method accepts the Id of a movie. If the Id has the value 1 then the method returns the movie Star Wars. Otherwise, the method throws an exception and returns 404 Not Found HTTP status code. After building your project, you can invoke the MovieController.GetMovie() method by entering the following URL in your web browser address bar: http://localhost:[port]/api/movie/1 (You’ll need to enter the correct randomly generated port). In the URL api/movie/1, the first “api” segment indicates that this is a Web API route. The “movie” segment indicates that the MovieController should be invoked. You do not specify the name of the action. Instead, the HTTP method used to make the request – GET, POST, PUT, DELETE — is used to identify the action to invoke. The ASP.NET Web API uses different routing conventions than normal ASP.NET MVC controllers. When you make an HTTP GET request then any API controller method with a name that starts with “GET” is invoked. So, we could have called our API controller action GetPopcorn() instead of GetMovie() and it would still be invoked by the URL api/movie/1. The default route for the Web API is defined in the Global.asax file and it looks like this: routes.MapHttpRoute( name: "DefaultApi", routeTemplate: "api/{controller}/{id}", defaults: new { id = RouteParameter.Optional } ); We can invoke our GetMovie() controller action with the jQuery code in the following HTML page: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Get Movie</title> </head> <body> <div> Title: <span id="title"></span> </div> <div> Director: <span id="director"></span> </div> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> getMovie(1, function (movie) { $("#title").html(movie.Title); $("#director").html(movie.Director); }); function getMovie(id, callback) { $.ajax({ url: "/api/Movie", data: { id: id }, type: "GET", contentType: "application/json;charset=utf-8", statusCode: { 200: function (movie) { callback(movie); }, 404: function () { alert("Not Found!"); } } }); } </script> </body> </html> In the code above, the jQuery $.ajax() method is used to invoke the GetMovie() method. Notice that the Ajax call handles two HTTP response codes. When the GetMove() method successfully returns a movie, the method returns a 200 status code. In that case, the details of the movie are displayed in the HTML page. Otherwise, if the movie is not found, the GetMovie() method returns a 404 status code. In that case, the page simply displays an alert box indicating that the movie was not found (hopefully, you would implement something more graceful in an actual application). You can use your browser’s Developer Tools to see what is going on in the background when you open the HTML page (hit F12 in the most recent version of most browsers). For example, you can use the Network tab in Google Chrome to see the Ajax request which invokes the GetMovie() method: Getting a Set of Records with the ASP.NET Web API Let’s modify our Movie API controller so that it returns a collection of movies. The following Movie controller has a new ListMovies() method which returns a (hard-coded) collection of movies: using System; using System.Collections.Generic; using System.Linq; using System.Net; using System.Net.Http; using System.Web.Http; using MyWebAPIApp.Models; namespace MyWebAPIApp.Controllers { public class MovieController : ApiController { public IEnumerable<Movie> ListMovies() { return new List<Movie> { new Movie {Id=1, Title="Star Wars", Director="Lucas"}, new Movie {Id=1, Title="King Kong", Director="Jackson"}, new Movie {Id=1, Title="Memento", Director="Nolan"} }; } } } Because we named our action ListMovies(), the default Web API route will never match it. Therefore, we need to add the following custom route to our Global.asax file (at the top of the RegisterRoutes() method): routes.MapHttpRoute( name: "ActionApi", routeTemplate: "api/{controller}/{action}/{id}", defaults: new { id = RouteParameter.Optional } ); This route enables us to invoke the ListMovies() method with the URL /api/movie/listmovies. Now that we have exposed our collection of movies from the server, we can retrieve and display the list of movies using jQuery in our HTML page: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>List Movies</title> </head> <body> <div id="movies"></div> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> listMovies(function (movies) { var strMovies=""; $.each(movies, function (index, movie) { strMovies += "<div>" + movie.Title + "</div>"; }); $("#movies").html(strMovies); }); function listMovies(callback) { $.ajax({ url: "/api/Movie/ListMovies", data: {}, type: "GET", contentType: "application/json;charset=utf-8", }).then(function(movies){ callback(movies); }); } </script> </body> </html> Inserting a Record with the ASP.NET Web API Now let’s modify our Movie API controller so it supports creating new records: public HttpResponseMessage<Movie> PostMovie(Movie movieToCreate) { // Add movieToCreate to the database and update primary key movieToCreate.Id = 23; // Build a response that contains the location of the new movie var response = new HttpResponseMessage<Movie>(movieToCreate, HttpStatusCode.Created); var relativePath = "/api/movie/" + movieToCreate.Id; response.Headers.Location = new Uri(Request.RequestUri, relativePath); return response; } The PostMovie() method in the code above accepts a movieToCreate parameter. We don’t actually store the new movie anywhere. In real life, you will want to call a service method to store the new movie in a database. When you create a new resource, such as a new movie, you should return the location of the new resource. In the code above, the URL where the new movie can be retrieved is assigned to the Location header returned in the PostMovie() response. Because the name of our method starts with “Post”, we don’t need to create a custom route. The PostMovie() method can be invoked with the URL /Movie/PostMovie – just as long as the method is invoked within the context of a HTTP POST request. The following HTML page invokes the PostMovie() method. <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Create Movie</title> </head> <body> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> var movieToCreate = { title: "The Hobbit", director: "Jackson" }; createMovie(movieToCreate, function (newMovie) { alert("New movie created with an Id of " + newMovie.Id); }); function createMovie(movieToCreate, callback) { $.ajax({ url: "/api/Movie", data: JSON.stringify( movieToCreate ), type: "POST", contentType: "application/json;charset=utf-8", statusCode: { 201: function (newMovie) { callback(newMovie); } } }); } </script> </body> </html> This page creates a new movie (the Hobbit) by calling the createMovie() method. The page simply displays the Id of the new movie: The HTTP Post operation is performed with the following call to the jQuery $.ajax() method: $.ajax({ url: "/api/Movie", data: JSON.stringify( movieToCreate ), type: "POST", contentType: "application/json;charset=utf-8", statusCode: { 201: function (newMovie) { callback(newMovie); } } }); Notice that the type of Ajax request is a POST request. This is required to match the PostMovie() method. Notice, furthermore, that the new movie is converted into JSON using JSON.stringify(). The JSON.stringify() method takes a JavaScript object and converts it into a JSON string. Finally, notice that success is represented with a 201 status code. The HttpStatusCode.Created value returned from the PostMovie() method returns a 201 status code. Updating a Record with the ASP.NET Web API Here’s how we can modify the Movie API controller to support updating an existing record. In this case, we need to create a PUT method to handle an HTTP PUT request: public void PutMovie(Movie movieToUpdate) { if (movieToUpdate.Id == 1) { // Update the movie in the database return; } // If you can't find the movie to update throw new HttpResponseException(HttpStatusCode.NotFound); } Unlike our PostMovie() method, the PutMovie() method does not return a result. The action either updates the database or, if the movie cannot be found, returns an HTTP Status code of 404. The following HTML page illustrates how you can invoke the PutMovie() method: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Put Movie</title> </head> <body> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> var movieToUpdate = { id: 1, title: "The Hobbit", director: "Jackson" }; updateMovie(movieToUpdate, function () { alert("Movie updated!"); }); function updateMovie(movieToUpdate, callback) { $.ajax({ url: "/api/Movie", data: JSON.stringify(movieToUpdate), type: "PUT", contentType: "application/json;charset=utf-8", statusCode: { 200: function () { callback(); }, 404: function () { alert("Movie not found!"); } } }); } </script> </body> </html> Deleting a Record with the ASP.NET Web API Here’s the code for deleting a movie: public HttpResponseMessage DeleteMovie(int id) { // Delete the movie from the database // Return status code return new HttpResponseMessage(HttpStatusCode.NoContent); } This method simply deletes the movie (well, not really, but pretend that it does) and returns a No Content status code (204). The following page illustrates how you can invoke the DeleteMovie() action: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Delete Movie</title> </head> <body> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> deleteMovie(1, function () { alert("Movie deleted!"); }); function deleteMovie(id, callback) { $.ajax({ url: "/api/Movie", data: JSON.stringify({id:id}), type: "DELETE", contentType: "application/json;charset=utf-8", statusCode: { 204: function () { callback(); } } }); } </script> </body> </html> Performing Validation How do you perform form validation when using the ASP.NET Web API? Because validation in ASP.NET MVC is driven by the Default Model Binder, and because the Web API uses the Default Model Binder, you get validation for free. Let’s modify our Movie class so it includes some of the standard validation attributes: using System.ComponentModel.DataAnnotations; namespace MyWebAPIApp.Models { public class Movie { public int Id { get; set; } [Required(ErrorMessage="Title is required!")] [StringLength(5, ErrorMessage="Title cannot be more than 5 characters!")] public string Title { get; set; } [Required(ErrorMessage="Director is required!")] public string Director { get; set; } } } In the code above, the Required validation attribute is used to make both the Title and Director properties required. The StringLength attribute is used to require the length of the movie title to be no more than 5 characters. Now let’s modify our PostMovie() action to validate a movie before adding the movie to the database: public HttpResponseMessage PostMovie(Movie movieToCreate) { // Validate movie if (!ModelState.IsValid) { var errors = new JsonArray(); foreach (var prop in ModelState.Values) { if (prop.Errors.Any()) { errors.Add(prop.Errors.First().ErrorMessage); } } return new HttpResponseMessage<JsonValue>(errors, HttpStatusCode.BadRequest); } // Add movieToCreate to the database and update primary key movieToCreate.Id = 23; // Build a response that contains the location of the new movie var response = new HttpResponseMessage<Movie>(movieToCreate, HttpStatusCode.Created); var relativePath = "/api/movie/" + movieToCreate.Id; response.Headers.Location = new Uri(Request.RequestUri, relativePath); return response; } If ModelState.IsValid has the value false then the errors in model state are copied to a new JSON array. Each property – such as the Title and Director property — can have multiple errors. In the code above, only the first error message is copied over. The JSON array is returned with a Bad Request status code (400 status code). The following HTML page illustrates how you can invoke our modified PostMovie() action and display any error messages: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Create Movie</title> </head> <body> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> var movieToCreate = { title: "The Hobbit", director: "" }; createMovie(movieToCreate, function (newMovie) { alert("New movie created with an Id of " + newMovie.Id); }, function (errors) { var strErrors = ""; $.each(errors, function(index, err) { strErrors += "*" + err + "\n"; }); alert(strErrors); } ); function createMovie(movieToCreate, success, fail) { $.ajax({ url: "/api/Movie", data: JSON.stringify(movieToCreate), type: "POST", contentType: "application/json;charset=utf-8", statusCode: { 201: function (newMovie) { success(newMovie); }, 400: function (xhr) { var errors = JSON.parse(xhr.responseText); fail(errors); } } }); } </script> </body> </html> The createMovie() function performs an Ajax request and handles either a 201 or a 400 status code from the response. If a 201 status code is returned then there were no validation errors and the new movie was created. If, on the other hand, a 400 status code is returned then there was a validation error. The validation errors are retrieved from the XmlHttpRequest responseText property. The error messages are displayed in an alert: (Please don’t use JavaScript alert dialogs to display validation errors, I just did it this way out of pure laziness) This validation code in our PostMovie() method is pretty generic. There is nothing specific about this code to the PostMovie() method. In the following video, Jon Galloway demonstrates how to create a global Validation filter which can be used with any API controller action: http://www.asp.net/web-api/overview/web-api-routing-and-actions/video-custom-validation His validation filter looks like this: using System.Json; using System.Linq; using System.Net; using System.Net.Http; using System.Web.Http.Controllers; using System.Web.Http.Filters; namespace MyWebAPIApp.Filters { public class ValidationActionFilter:ActionFilterAttribute { public override void OnActionExecuting(HttpActionContext actionContext) { var modelState = actionContext.ModelState; if (!modelState.IsValid) { dynamic errors = new JsonObject(); foreach (var key in modelState.Keys) { var state = modelState[key]; if (state.Errors.Any()) { errors[key] = state.Errors.First().ErrorMessage; } } actionContext.Response = new HttpResponseMessage<JsonValue>(errors, HttpStatusCode.BadRequest); } } } } And you can register the validation filter in the Application_Start() method in the Global.asax file like this: GlobalConfiguration.Configuration.Filters.Add(new ValidationActionFilter()); After you register the Validation filter, validation error messages are returned from any API controller action method automatically when validation fails. You don’t need to add any special logic to any of your API controller actions to take advantage of the filter. Querying using OData The OData protocol is an open protocol created by Microsoft which enables you to perform queries over the web. The official website for OData is located here: http://odata.org For example, here are some of the query options which you can use with OData: · $orderby – Enables you to retrieve results in a certain order. · $top – Enables you to retrieve a certain number of results. · $skip – Enables you to skip over a certain number of results (use with $top for paging). · $filter – Enables you to filter the results returned. The ASP.NET Web API supports a subset of the OData protocol. You can use all of the query options listed above when interacting with an API controller. The only requirement is that the API controller action returns its data as IQueryable. For example, the following Movie controller has an action named GetMovies() which returns an IQueryable of movies: public IQueryable<Movie> GetMovies() { return new List<Movie> { new Movie {Id=1, Title="Star Wars", Director="Lucas"}, new Movie {Id=2, Title="King Kong", Director="Jackson"}, new Movie {Id=3, Title="Willow", Director="Lucas"}, new Movie {Id=4, Title="Shrek", Director="Smith"}, new Movie {Id=5, Title="Memento", Director="Nolan"} }.AsQueryable(); } If you enter the following URL in your browser: /api/movie?$top=2&$orderby=Title Then you will limit the movies returned to the top 2 in order of the movie Title. You will get the following results: By using the $top option in combination with the $skip option, you can enable client-side paging. For example, you can use $top and $skip to page through thousands of products, 10 products at a time. The $filter query option is very powerful. You can use this option to filter the results from a query. Here are some examples: Return every movie directed by Lucas: /api/movie?$filter=Director eq ‘Lucas’ Return every movie which has a title which starts with ‘S’: /api/movie?$filter=startswith(Title,’S') Return every movie which has an Id greater than 2: /api/movie?$filter=Id gt 2 The complete documentation for the $filter option is located here: http://www.odata.org/developers/protocols/uri-conventions#FilterSystemQueryOption Summary The goal of this blog entry was to provide you with an overview of the new ASP.NET Web API introduced with the Beta release of ASP.NET 4. In this post, I discussed how you can retrieve, insert, update, and delete data by using jQuery with the Web API. I also discussed how you can use the standard validation attributes with the Web API. You learned how to return validation error messages to the client and display the error messages using jQuery. Finally, we briefly discussed how the ASP.NET Web API supports the OData protocol. For example, you learned how to filter records returned from an API controller action by using the $filter query option. I’m excited about the new Web API. This is a feature which I expect to use with almost every ASP.NET application which I build in the future.

Read the article

Introduction to the ASP.NET Web API

- by Stephen.Walther

I am a huge fan of Ajax. If you want to create a great experience for the users of your website – regardless of whether you are building an ASP.NET MVC or an ASP.NET Web Forms site — then you need to use Ajax. Otherwise, you are just being cruel to your customers. We use Ajax extensively in several of the ASP.NET applications that my company, Superexpert.com, builds. We expose data from the server as JSON and use jQuery to retrieve and update that data from the browser. One challenge, when building an ASP.NET website, is deciding on which technology to use to expose JSON data from the server. For example, how do you expose a list of products from the server as JSON so you can retrieve the list of products with jQuery? You have a number of options (too many options) including ASMX Web services, WCF Web Services, ASHX Generic Handlers, WCF Data Services, and MVC controller actions. Fortunately, the world has just been simplified. With the release of ASP.NET 4 Beta, Microsoft has introduced a new technology for exposing JSON from the server named the ASP.NET Web API. You can use the ASP.NET Web API with both ASP.NET MVC and ASP.NET Web Forms applications. The goal of this blog post is to provide you with a brief overview of the features of the new ASP.NET Web API. You learn how to use the ASP.NET Web API to retrieve, insert, update, and delete database records with jQuery. We also discuss how you can perform form validation when using the Web API and use OData when using the Web API. Creating an ASP.NET Web API Controller The ASP.NET Web API exposes JSON data through a new type of controller called an API controller. You can add an API controller to an existing ASP.NET MVC 4 project through the standard Add Controller dialog box. Right-click your Controllers folder and select Add, Controller. In the dialog box, name your controller MovieController and select the Empty API controller template: A brand new API controller looks like this: using System; using System.Collections.Generic; using System.Linq; using System.Net.Http; using System.Web.Http; namespace MyWebAPIApp.Controllers { public class MovieController : ApiController { } } An API controller, unlike a standard MVC controller, derives from the base ApiController class instead of the base Controller class. Using jQuery to Retrieve, Insert, Update, and Delete Data Let’s create an Ajaxified Movie Database application. We’ll retrieve, insert, update, and delete movies using jQuery with the MovieController which we just created. Our Movie model class looks like this: namespace MyWebAPIApp.Models { public class Movie { public int Id { get; set; } public string Title { get; set; } public string Director { get; set; } } } Our application will consist of a single HTML page named Movies.html. We’ll place all of our jQuery code in the Movies.html page. Getting a Single Record with the ASP.NET Web API To support retrieving a single movie from the server, we need to add a Get method to our API controller: using System; using System.Collections.Generic; using System.Linq; using System.Net; using System.Net.Http; using System.Web.Http; using MyWebAPIApp.Models; namespace MyWebAPIApp.Controllers { public class MovieController : ApiController { public Movie GetMovie(int id) { // Return movie by id if (id == 1) { return new Movie { Id = 1, Title = "Star Wars", Director = "Lucas" }; } // Otherwise, movie was not found throw new HttpResponseException(HttpStatusCode.NotFound); } } } In the code above, the GetMovie() method accepts the Id of a movie. If the Id has the value 1 then the method returns the movie Star Wars. Otherwise, the method throws an exception and returns 404 Not Found HTTP status code. After building your project, you can invoke the MovieController.GetMovie() method by entering the following URL in your web browser address bar: http://localhost:[port]/api/movie/1 (You’ll need to enter the correct randomly generated port). In the URL api/movie/1, the first “api” segment indicates that this is a Web API route. The “movie” segment indicates that the MovieController should be invoked. You do not specify the name of the action. Instead, the HTTP method used to make the request – GET, POST, PUT, DELETE — is used to identify the action to invoke. The ASP.NET Web API uses different routing conventions than normal ASP.NET MVC controllers. When you make an HTTP GET request then any API controller method with a name that starts with “GET” is invoked. So, we could have called our API controller action GetPopcorn() instead of GetMovie() and it would still be invoked by the URL api/movie/1. The default route for the Web API is defined in the Global.asax file and it looks like this: routes.MapHttpRoute( name: "DefaultApi", routeTemplate: "api/{controller}/{id}", defaults: new { id = RouteParameter.Optional } ); We can invoke our GetMovie() controller action with the jQuery code in the following HTML page: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Get Movie</title> </head> <body> <div> Title: <span id="title"></span> </div> <div> Director: <span id="director"></span> </div> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> getMovie(1, function (movie) { $("#title").html(movie.Title); $("#director").html(movie.Director); }); function getMovie(id, callback) { $.ajax({ url: "/api/Movie", data: { id: id }, type: "GET", contentType: "application/json;charset=utf-8", statusCode: { 200: function (movie) { callback(movie); }, 404: function () { alert("Not Found!"); } } }); } </script> </body> </html> In the code above, the jQuery $.ajax() method is used to invoke the GetMovie() method. Notice that the Ajax call handles two HTTP response codes. When the GetMove() method successfully returns a movie, the method returns a 200 status code. In that case, the details of the movie are displayed in the HTML page. Otherwise, if the movie is not found, the GetMovie() method returns a 404 status code. In that case, the page simply displays an alert box indicating that the movie was not found (hopefully, you would implement something more graceful in an actual application). You can use your browser’s Developer Tools to see what is going on in the background when you open the HTML page (hit F12 in the most recent version of most browsers). For example, you can use the Network tab in Google Chrome to see the Ajax request which invokes the GetMovie() method: Getting a Set of Records with the ASP.NET Web API Let’s modify our Movie API controller so that it returns a collection of movies. The following Movie controller has a new ListMovies() method which returns a (hard-coded) collection of movies: using System; using System.Collections.Generic; using System.Linq; using System.Net; using System.Net.Http; using System.Web.Http; using MyWebAPIApp.Models; namespace MyWebAPIApp.Controllers { public class MovieController : ApiController { public IEnumerable<Movie> ListMovies() { return new List<Movie> { new Movie {Id=1, Title="Star Wars", Director="Lucas"}, new Movie {Id=1, Title="King Kong", Director="Jackson"}, new Movie {Id=1, Title="Memento", Director="Nolan"} }; } } } Because we named our action ListMovies(), the default Web API route will never match it. Therefore, we need to add the following custom route to our Global.asax file (at the top of the RegisterRoutes() method): routes.MapHttpRoute( name: "ActionApi", routeTemplate: "api/{controller}/{action}/{id}", defaults: new { id = RouteParameter.Optional } ); This route enables us to invoke the ListMovies() method with the URL /api/movie/listmovies. Now that we have exposed our collection of movies from the server, we can retrieve and display the list of movies using jQuery in our HTML page: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>List Movies</title> </head> <body> <div id="movies"></div> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> listMovies(function (movies) { var strMovies=""; $.each(movies, function (index, movie) { strMovies += "<div>" + movie.Title + "</div>"; }); $("#movies").html(strMovies); }); function listMovies(callback) { $.ajax({ url: "/api/Movie/ListMovies", data: {}, type: "GET", contentType: "application/json;charset=utf-8", }).then(function(movies){ callback(movies); }); } </script> </body> </html> Inserting a Record with the ASP.NET Web API Now let’s modify our Movie API controller so it supports creating new records: public HttpResponseMessage<Movie> PostMovie(Movie movieToCreate) { // Add movieToCreate to the database and update primary key movieToCreate.Id = 23; // Build a response that contains the location of the new movie var response = new HttpResponseMessage<Movie>(movieToCreate, HttpStatusCode.Created); var relativePath = "/api/movie/" + movieToCreate.Id; response.Headers.Location = new Uri(Request.RequestUri, relativePath); return response; } The PostMovie() method in the code above accepts a movieToCreate parameter. We don’t actually store the new movie anywhere. In real life, you will want to call a service method to store the new movie in a database. When you create a new resource, such as a new movie, you should return the location of the new resource. In the code above, the URL where the new movie can be retrieved is assigned to the Location header returned in the PostMovie() response. Because the name of our method starts with “Post”, we don’t need to create a custom route. The PostMovie() method can be invoked with the URL /Movie/PostMovie – just as long as the method is invoked within the context of a HTTP POST request. The following HTML page invokes the PostMovie() method. <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Create Movie</title> </head> <body> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> var movieToCreate = { title: "The Hobbit", director: "Jackson" }; createMovie(movieToCreate, function (newMovie) { alert("New movie created with an Id of " + newMovie.Id); }); function createMovie(movieToCreate, callback) { $.ajax({ url: "/api/Movie", data: JSON.stringify( movieToCreate ), type: "POST", contentType: "application/json;charset=utf-8", statusCode: { 201: function (newMovie) { callback(newMovie); } } }); } </script> </body> </html> This page creates a new movie (the Hobbit) by calling the createMovie() method. The page simply displays the Id of the new movie: The HTTP Post operation is performed with the following call to the jQuery $.ajax() method: $.ajax({ url: "/api/Movie", data: JSON.stringify( movieToCreate ), type: "POST", contentType: "application/json;charset=utf-8", statusCode: { 201: function (newMovie) { callback(newMovie); } } }); Notice that the type of Ajax request is a POST request. This is required to match the PostMovie() method. Notice, furthermore, that the new movie is converted into JSON using JSON.stringify(). The JSON.stringify() method takes a JavaScript object and converts it into a JSON string. Finally, notice that success is represented with a 201 status code. The HttpStatusCode.Created value returned from the PostMovie() method returns a 201 status code. Updating a Record with the ASP.NET Web API Here’s how we can modify the Movie API controller to support updating an existing record. In this case, we need to create a PUT method to handle an HTTP PUT request: public void PutMovie(Movie movieToUpdate) { if (movieToUpdate.Id == 1) { // Update the movie in the database return; } // If you can't find the movie to update throw new HttpResponseException(HttpStatusCode.NotFound); } Unlike our PostMovie() method, the PutMovie() method does not return a result. The action either updates the database or, if the movie cannot be found, returns an HTTP Status code of 404. The following HTML page illustrates how you can invoke the PutMovie() method: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Put Movie</title> </head> <body> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> var movieToUpdate = { id: 1, title: "The Hobbit", director: "Jackson" }; updateMovie(movieToUpdate, function () { alert("Movie updated!"); }); function updateMovie(movieToUpdate, callback) { $.ajax({ url: "/api/Movie", data: JSON.stringify(movieToUpdate), type: "PUT", contentType: "application/json;charset=utf-8", statusCode: { 200: function () { callback(); }, 404: function () { alert("Movie not found!"); } } }); } </script> </body> </html> Deleting a Record with the ASP.NET Web API Here’s the code for deleting a movie: public HttpResponseMessage DeleteMovie(int id) { // Delete the movie from the database // Return status code return new HttpResponseMessage(HttpStatusCode.NoContent); } This method simply deletes the movie (well, not really, but pretend that it does) and returns a No Content status code (204). The following page illustrates how you can invoke the DeleteMovie() action: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Delete Movie</title> </head> <body> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> deleteMovie(1, function () { alert("Movie deleted!"); }); function deleteMovie(id, callback) { $.ajax({ url: "/api/Movie", data: JSON.stringify({id:id}), type: "DELETE", contentType: "application/json;charset=utf-8", statusCode: { 204: function () { callback(); } } }); } </script> </body> </html> Performing Validation How do you perform form validation when using the ASP.NET Web API? Because validation in ASP.NET MVC is driven by the Default Model Binder, and because the Web API uses the Default Model Binder, you get validation for free. Let’s modify our Movie class so it includes some of the standard validation attributes: using System.ComponentModel.DataAnnotations; namespace MyWebAPIApp.Models { public class Movie { public int Id { get; set; } [Required(ErrorMessage="Title is required!")] [StringLength(5, ErrorMessage="Title cannot be more than 5 characters!")] public string Title { get; set; } [Required(ErrorMessage="Director is required!")] public string Director { get; set; } } } In the code above, the Required validation attribute is used to make both the Title and Director properties required. The StringLength attribute is used to require the length of the movie title to be no more than 5 characters. Now let’s modify our PostMovie() action to validate a movie before adding the movie to the database: public HttpResponseMessage PostMovie(Movie movieToCreate) { // Validate movie if (!ModelState.IsValid) { var errors = new JsonArray(); foreach (var prop in ModelState.Values) { if (prop.Errors.Any()) { errors.Add(prop.Errors.First().ErrorMessage); } } return new HttpResponseMessage<JsonValue>(errors, HttpStatusCode.BadRequest); } // Add movieToCreate to the database and update primary key movieToCreate.Id = 23; // Build a response that contains the location of the new movie var response = new HttpResponseMessage<Movie>(movieToCreate, HttpStatusCode.Created); var relativePath = "/api/movie/" + movieToCreate.Id; response.Headers.Location = new Uri(Request.RequestUri, relativePath); return response; } If ModelState.IsValid has the value false then the errors in model state are copied to a new JSON array. Each property – such as the Title and Director property — can have multiple errors. In the code above, only the first error message is copied over. The JSON array is returned with a Bad Request status code (400 status code). The following HTML page illustrates how you can invoke our modified PostMovie() action and display any error messages: <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>Create Movie</title> </head> <body> <script type="text/javascript" src="Scripts/jquery-1.6.2.min.js"></script> <script type="text/javascript"> var movieToCreate = { title: "The Hobbit", director: "" }; createMovie(movieToCreate, function (newMovie) { alert("New movie created with an Id of " + newMovie.Id); }, function (errors) { var strErrors = ""; $.each(errors, function(index, err) { strErrors += "*" + err + "n"; }); alert(strErrors); } ); function createMovie(movieToCreate, success, fail) { $.ajax({ url: "/api/Movie", data: JSON.stringify(movieToCreate), type: "POST", contentType: "application/json;charset=utf-8", statusCode: { 201: function (newMovie) { success(newMovie); }, 400: function (xhr) { var errors = JSON.parse(xhr.responseText); fail(errors); } } }); } </script> </body> </html> The createMovie() function performs an Ajax request and handles either a 201 or a 400 status code from the response. If a 201 status code is returned then there were no validation errors and the new movie was created. If, on the other hand, a 400 status code is returned then there was a validation error. The validation errors are retrieved from the XmlHttpRequest responseText property. The error messages are displayed in an alert: (Please don’t use JavaScript alert dialogs to display validation errors, I just did it this way out of pure laziness) This validation code in our PostMovie() method is pretty generic. There is nothing specific about this code to the PostMovie() method. In the following video, Jon Galloway demonstrates how to create a global Validation filter which can be used with any API controller action: http://www.asp.net/web-api/overview/web-api-routing-and-actions/video-custom-validation His validation filter looks like this: using System.Json; using System.Linq; using System.Net; using System.Net.Http; using System.Web.Http.Controllers; using System.Web.Http.Filters; namespace MyWebAPIApp.Filters { public class ValidationActionFilter:ActionFilterAttribute { public override void OnActionExecuting(HttpActionContext actionContext) { var modelState = actionContext.ModelState; if (!modelState.IsValid) { dynamic errors = new JsonObject(); foreach (var key in modelState.Keys) { var state = modelState[key]; if (state.Errors.Any()) { errors[key] = state.Errors.First().ErrorMessage; } } actionContext.Response = new HttpResponseMessage<JsonValue>(errors, HttpStatusCode.BadRequest); } } } } And you can register the validation filter in the Application_Start() method in the Global.asax file like this: GlobalConfiguration.Configuration.Filters.Add(new ValidationActionFilter()); After you register the Validation filter, validation error messages are returned from any API controller action method automatically when validation fails. You don’t need to add any special logic to any of your API controller actions to take advantage of the filter. Querying using OData The OData protocol is an open protocol created by Microsoft which enables you to perform queries over the web. The official website for OData is located here: http://odata.org For example, here are some of the query options which you can use with OData: · $orderby – Enables you to retrieve results in a certain order. · $top – Enables you to retrieve a certain number of results. · $skip – Enables you to skip over a certain number of results (use with $top for paging). · $filter – Enables you to filter the results returned. The ASP.NET Web API supports a subset of the OData protocol. You can use all of the query options listed above when interacting with an API controller. The only requirement is that the API controller action returns its data as IQueryable. For example, the following Movie controller has an action named GetMovies() which returns an IQueryable of movies: public IQueryable<Movie> GetMovies() { return new List<Movie> { new Movie {Id=1, Title="Star Wars", Director="Lucas"}, new Movie {Id=2, Title="King Kong", Director="Jackson"}, new Movie {Id=3, Title="Willow", Director="Lucas"}, new Movie {Id=4, Title="Shrek", Director="Smith"}, new Movie {Id=5, Title="Memento", Director="Nolan"} }.AsQueryable(); } If you enter the following URL in your browser: /api/movie?$top=2&$orderby=Title Then you will limit the movies returned to the top 2 in order of the movie Title. You will get the following results: By using the $top option in combination with the $skip option, you can enable client-side paging. For example, you can use $top and $skip to page through thousands of products, 10 products at a time. The $filter query option is very powerful. You can use this option to filter the results from a query. Here are some examples: Return every movie directed by Lucas: /api/movie?$filter=Director eq ‘Lucas’ Return every movie which has a title which starts with ‘S’: /api/movie?$filter=startswith(Title,’S') Return every movie which has an Id greater than 2: /api/movie?$filter=Id gt 2 The complete documentation for the $filter option is located here: http://www.odata.org/developers/protocols/uri-conventions#FilterSystemQueryOption Summary The goal of this blog entry was to provide you with an overview of the new ASP.NET Web API introduced with the Beta release of ASP.NET 4. In this post, I discussed how you can retrieve, insert, update, and delete data by using jQuery with the Web API. I also discussed how you can use the standard validation attributes with the Web API. You learned how to return validation error messages to the client and display the error messages using jQuery. Finally, we briefly discussed how the ASP.NET Web API supports the OData protocol. For example, you learned how to filter records returned from an API controller action by using the $filter query option. I’m excited about the new Web API. This is a feature which I expect to use with almost every ASP.NET application which I build in the future.

Read the article

Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z,]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\$ " , " " ); gsub ( "\ $" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Read the article

Data Modeling: Logical Modeling Exercise

- by swisscheese

In trying to learn the art of data storage I have been trying to take in as much solid information as possible. PerformanceDBA posted some really helpful tutorials/examples in the following posts among others: is my data normalized? and Relational table naming convention. I already asked a subset question of this model here. So to make sure I understood the concepts he presented and I have seen elsewhere I wanted to take things a step or two further and see if I am grasping the concepts. Hence the purpose of this post, which hopefully others can also learn from. Everything I present is conceptual to me and for learning rather than applying it in some production system. It would be cool to get some input from PerformanceDBA also since I used his models to get started, but I appreciate all input given from anyone. As I am new to databases and especially modeling I will be the first to admit that I may not always ask the right questions, explain my thoughts clearly, or use the right verbage due to lack of expertise on the subject. So please keep that in mind and feel free to steer me in the right direction if I head off track. If there is enough interest in this I would like to take this from the logical to physical phases to show the evolution of the process and share it here on Stack. I will keep this thread for the Logical Diagram though and start new one for the additional steps. For my understanding I will be building a MySQL DB in the end to run some tests and see if what I came up with actually works. Here is the list of things that I want to capture in this conceptual model. Edit for V1.2 The purpose of this is to list Bands, their members, and the Events that they will be appearing at, as well as offer music and other merchandise for sale Members will be able to match up with friends Members can write reviews on the Bands, their music, and their events. There can only be one review per member on a given item, although they can edit their reviews and history will be maintained. BandMembers will have the chance to write a single Comment on Reviews about the Band they are associated with. Collectively as a Band only one Comment is allowed per Review. Members can then rate all Reviews and Comments but only once per given instance Members can select their favorite Bands, music, Merchandise, and Events Bands, Songs, and Events will be categorized into the type of Genre that they are and then further subcategorized into a SubGenre if necessary. It is ok for a Band or Event to fall into more then one Genre/SubGenre combination. Event date, time, and location will be posted for a given band and members can show that they will be attending the Event. An Event can be comprised of more than one Band, and multiple Events can take place at a single location on the same day Every party will be tied to at least one address and address history shall be maintained. Each party could also be tied to more then one address at a time (i.e. billing, shipping, physical) There will be stored profiles for Bands, BandMembers, and general members. So there it is, maybe a bit involved but could be a great learning tool for many hopefully as the process evolves and input is given by the community. Any input? EDIT v1.1 In response to PerformanceDBA U.3) That means no merchandise other than Band merchandise in the database. Correct ? That was my original thought but you got me thinking. Maybe the site would want to sell its own merchandise or even other merchandise from the bands. Not sure a mod to make for that. Would it require an entire rework of the Catalog section or just the identifying relationship that exists with the Band? Attempted a mod to sell both complete albums or song. Either way they would both be in electronic format only available for download. That is why I listed an Album as being comprised of Songs rather then 2 separate entities. U.5) I understand what you bring up about the circular relation with Favorite. I would like to get to this “It is either one Entity with some form of differentiation (FavoriteType) which identifies its treatment” but how to is not clear to me. What am I missing here? u.6) “Business Rules This is probably the only area you are weak in.” Thanks for the honest response. I will readdress these but I hope to clear up some confusion in my head first with the responses I have posted back to you. Q.1) Yes I would like to have Accepted, Rejected, and Blocked. I am not sure what you are referring to as to how this would change the logical model? Q.2) A person does not have to be a User. They can exist only as a BandMember. Is that what you are asking? Minor Issue Zero, One, or More…Oops I admit I forgot to give this attention when building the model. I am submitting this version as is and will address in a future version. I need to read up more on Constraint Checking to make sure I am understanding things. M.4) Depends if you envision OrderPurchase in the future. Can you expand as to what you mean here? EDIT V1.2 In response to PerformanceDBA input... Lessons learned. I was mixing the concept of Identifying / Non-Identifying and Cardinality (i.e. Genre / SubGenre), and doing so inconsistently to make things worse. Associative Tables are not required in Logical Diagrams as their many-to-many relationships can be depicted and then expanded in the Physical Model. I was overlooking the Cardinality in a lot of the relationships The importance of reading through relationships using effective Verb Phrases to reassure I am modeling what I want to accomplish. U.2) In the concept of this model it is only required to track a Venue as a location for an Event. No further data needs to be collected. With that being said Events will take place on a given EventDate and will be hosted at a Venue. Venues will host multiple events and possibly multiple events on a given date. In my new model my thinking was that EventDate is already tied to Event . Therefore, Venue will not need a relationship with EventDate. The 5th and 6th bullets you have listed under U.2) leave me questioning my thinking though. Am I missing something here? U.3) Is it time to move the link between Item and Band up to Item and Party instead? With the current design I don't see a possibility to sell merchandise not tied to the band as you have brought up. U.5) I left as per your input rather than making it a discrete Supertype/Subtype Relationship as I don’t see a benefit of having that type of roll up. Additional Revisions AR.1) After going through the exercise for FavoriteItem, I feel that Item to Review requires a many-to-many relationship so that is indicated. Necessary? Ok here we go for v1.3 I took a few days on this version, going back and forth with my design. Once the logical process is complete, as I want to see if I am on the right track, I will go through in depth what I had learned and the troubles I faced as a beginner going through this process. The big point for this version was it took throwing in some Keys to help see what I was missing in the past. Going through the process of doing a matrix proved to be of great help also. Regardless of anything, if it wasn't for the input given by PerformanceDBA I would still be a lost soul wondering in the dark. Who knows my current design might reaffirm that I still am, but I have learned a lot so I am know I at least have a flashlight in my hand. At this point in time I admit that I am still confused about identifying and non-identifying relationships. In my model I had to use non-identifying relationships with non nulls just to join the relationships I wanted to model. In reading a lot on the subject there seems to be a lot of disagreement and indecisiveness on the subject so I did what I thought represented the right things in my model. When to force (identifying) and when to be free (non-identifying)? Anyone have inputs? EDIT V1.4 Ok took the V1.3 inputs and cleaned things up for this V1.4 Currently working on a V1.5 to include attributes.

Read the article

making query from different related tables using codeigniter

- by fatemeh karam

I'm using codeigniter as i mentioned this is a part of my view code foreach($projects_query as $row)// $row indicates the projects { ?> <tr><td><h3><button type="submit" class="button red-gradient glossy" name = "project_click" > <?php echo $row->txtTaskName; ?></button></h3></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr> <?php foreach($tasks_query as $row2) { // if( $row->txtTaskName == "TestProject") if($row->intTaskID == $row2->intInside)// intInside indicades that the current task($row2) is the subset of which task (system , subsystem or project) { if($row2->intSummary == 0)//if the task(the system) is an executable task & doesn't have any subtask: { $query_team_user_id = $this->admin_in_out_model->get_user_team_task_query($row2->intTaskID);//runs the function and generates a query from tbl_userteamtask where intTaskID equals to the selected row's intTaskID foreach($query_team_user_id as $row_teamid) { $query_teamname = $this->admin_in_out_model->get_team_name($row_teamid->intTeamID); $query_fn_ln = $this->admin_in_out_model->get_fn_ln_from_userid($row_teamid->intUserID); foreach($query_teamname as $row_teamname) {?> <tr><td></td><td></td><td><h4> <?php echo $row2->txtTaskName;?></h4></td> <td><b><font color='#F33558'><?php echo $row_teamname->txtTeamName;?></font></b></td> <?php } foreach($query_fn_ln as $row_f_l_name) {?> <td> <?php echo $row_f_l_name->txtFirstname." ".$row_f_l_name->txtLastname;?></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td> <?php }?> </tr> <?php } } else{ ?> <tr><td></td><td></td><td><h4> <?php echo $row2->txtTaskName;?></h4></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr><?php } foreach($tasks_query as $row_subsystems) { if($row_subsystems->intInside == $row2->intTaskID )//if the task is the subtask of a system(it means the task is a subsystem) { if($row_subsystems->intSummary == 0)//if the task is an executable task & doesn't have any subtask: { $query_team_user_id = $this->admin_in_out_model->get_user_team_task_query($row_subsystems->intTaskID); foreach($query_team_user_id as $row_teamid) {?> <tr><?php $query_teamname = $this->admin_in_out_model->get_team_name($row_teamid->intTeamID); $query_fn_ln = $this->admin_in_out_model->get_fn_ln_from_userid($row_teamid->intUserID); foreach($query_teamname as $row_teamname) {?> <td></td><td></td><td><h5><?php echo $row_subsystems->txtTaskName?></h5><br/></td> <td><b><font color='#F33558'><?php echo $row_teamname->txtTeamName;?></font></b></td><?php } foreach($query_fn_ln as $row_f_l_name) {?> <td><?php echo $row_f_l_name->txtFirstname." ".$row_f_l_name->txtLastname;?></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><?php }?> </tr><?php } } else{ ?><tr><td></td><td></td><td><h5><?php echo $row_subsystems->txtTaskName?></h5></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr><?php } foreach($tasks_query as $row_tasks) { if($row_tasks->intInside == $row_subsystems->intTaskID )//if the task is the subtask of a subsystem { if($row_tasks->intSummary == 0)//if the task is an executable task & doesn't have any subtask: { $query_team_user_id = $this->admin_in_out_model->get_user_team_task_query($row_tasks->intTaskID); foreach($query_team_user_id as $row_teamid) {?> <tr><?php $query_teamname = $this->admin_in_out_model->get_team_name($row_teamid->intTeamID); $query_fn_ln = $this->admin_in_out_model->get_fn_ln_from_userid($row_teamid->intUserID); foreach($query_teamname as $row_teamname) {?> <td></td><td></td><td><b><?php echo $row_tasks->txtTaskName;?></b></td> <td><b><font color='#F33558'><?php echo $row_teamname->txtTeamName;?></font></b></td><?php } foreach($query_fn_ln as $row_f_l_name) {?> <td><?php echo $row_f_l_name->txtFirstname." ".$row_f_l_name->txtLastname;?></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><?php }?> </tr><?php } } } } } } } } }?> and in controller i have $projects_query = $this->admin_in_out_model->get_projects(); $tasks_query = $this->admin_in_out_model->get_systems(); $userteamtask = $this->admin_in_out_model->get_user_team_task(); $data['tasks_query'] = $tasks_query; $data['projects_query'] = $projects_query; $this->load->view('project_view',$data); but as you see I'm calling my model functions within the view how can i do something else to do this i mean not calling my model function in my view I have to add that, my model function have parameters these are the model functions: function get_projects() { $this -> db -> select('*'); $this -> db -> from('tbl_task'); $this -> db -> where('intInside','0'); $query = $this->db->get(); return $query->result(); } function get_systems() { $this -> db -> select('*'); $this -> db -> from('tbl_task '); $this -> db -> where('intInside <> ','0'); $query = $this->db->get(); return $query->result(); } function get_user_team_task_query($task_id)//gets information from tbl_userteamtask where the field intTaskID is equal to $task_id { $this -> db -> select('*'); $this -> db -> from('tbl_userteamtask'); $this -> db -> where('intTaskID',$task_id); $query_teamid = $this->db->get(); return $query_teamid->result(); } function get_user_team_task()//gets information from tbl_userteamtask where the field intTaskID is equal to $task_id { $this -> db -> select('*'); $this -> db -> from('tbl_userteamtask'); // $this -> db -> where('intTaskID',$task_id); $query_teamid = $this->db->get(); return $query_teamid->result(); } function get_team_name($query_teamid) { $this -> db -> select('*'); $this -> db -> from('tbl_team'); $this -> db -> where('intTeamID',$query_teamid); $query_teamname = $this->db->get(); return $query_teamname->result(); } function get_user_name($query_userid) { $this -> db -> select('*'); $this -> db -> from('tbl_user'); $this -> db -> where('intUserID',$query_userid); $query_username = $this->db->get(); return $query_username->result(); } function get_fn_ln_from_userid($selected_id) { $this -> db -> select('tbl_user.intUserID, tbl_user.intPersonID,tbl_person.intPersonID,tbl_person.txtFirstname, tbl_person.txtLastname'); $this -> db -> from('tbl_user , tbl_person'); $where = "tbl_user.intPersonID = tbl_person.intPersonID "; $this -> db -> where($where); $this -> db -> where('tbl_user.intUserID', $selected_id); $query = $this -> db -> get();//makes query from DB return $query->result(); } do I have to use subquery ? is this true? i mean can i do this? foreach( $data as $key => $each ) { $data[$key]['team_id'] = $this->get_user_team_task_query( $each['intTaskID'] ); foreach($data[$key]['team_id'] as $key_teamname => $each) { $data[$key_teamname]['team_name'] = $this->get_team_name( $each['intTeamID'] ); } } the model code: foreach( $data as $key => $each ) { $data[$key]['intTaskID'] = $each['intTaskID']; $data[$key]['team_id'] = $this->get_user_team_task_query( $each['intTaskID'] ); foreach($data[$key]['team_id'] as $key => $each) { $data[$key]['team_name'] = $this->get_team_name( $each['intTeamID'] ); #fetching of the teamname and saving in the array $data[$key]['user_name'] = $this->get_fn_ln_from_userid( $each['intUserID'] ); foreach($data[$key]['user_name'] as $key => $each) { $data[$key]['first_name'] = $each['txtFirstname'] ; $data[$key]['last_name'] = $each['txtLastname'] ; } $data[$key]['first_name'] = $data[$key]['first_name']; $data[$key]['last_name'] = $data[$key]['last_name']; } }

Read the article

SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article

SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article

Problems extracting information from RSS feed description field

- by Graeme

Hi, I've built an iPhone application using the parsing code from the TopSongs sample iPhone application. I've hit a problem though - the feed I'm trying to parse data from doesn't have a separate field for every piece of information (i.e. if it was for a feed about dogs, all the information such as dog type, dog age and dog price is contained in the feed. However, the TopSongs app relies on information having its own tags, so instead of using it uses and . So my question is this. How do I extract this information from the description field so that it can be parsed using the TopSongs parser? Can you somehow extract the dog age, price and type information using Yahoo Pipes and use that RSS feed for the feed? Or is there code that I can add to do it in application? Update: To view the code of my application parser (based on the TopSongs Core Data Apple provided application, see below. Here's a sample of one item from the the actual RSS feed I'm using (the description is longer, and has status,size, and a couple of other fields, but they're all formatted the same.: <item> <title>MOE, MARGRET STREET</title> <description> <b>District/Region:</b> REGION 09</br><b>Location:</b> MOE</br><b>Name:</b> MARGRET STREET</br></description> <pubDate>Thu,11 Mar 2010 05:43:03 GMT</pubDate> <guid>1266148</guid> </item> /* File: iTunesRSSImporter.m Abstract: Downloads, parses, and imports the iTunes top songs RSS feed into Core Data. Version: 1.1 Disclaimer: IMPORTANT: This Apple software is supplied to you by Apple Inc. ("Apple") in consideration of your agreement to the following terms, and your use, installation, modification or redistribution of this Apple software constitutes acceptance of these terms. If you do not agree with these terms, please do not use, install, modify or redistribute this Apple software. In consideration of your agreement to abide by the following terms, and subject to these terms, Apple grants you a personal, non-exclusive license, under Apple's copyrights in this original Apple software (the "Apple Software"), to use, reproduce, modify and redistribute the Apple Software, with or without modifications, in source and/or binary forms; provided that if you redistribute the Apple Software in its entirety and without modifications, you must retain this notice and the following text and disclaimers in all such redistributions of the Apple Software. Neither the name, trademarks, service marks or logos of Apple Inc. may be used to endorse or promote products derived from the Apple Software without specific prior written permission from Apple. Except as expressly stated in this notice, no other rights or licenses, express or implied, are granted by Apple herein, including but not limited to any patent rights that may be infringed by your derivative works or by other works in which the Apple Software may be incorporated. The Apple Software is provided by Apple on an "AS IS" basis. APPLE MAKES NO WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, REGARDING THE APPLE SOFTWARE OR ITS USE AND OPERATION ALONE OR IN COMBINATION WITH YOUR PRODUCTS. IN NO EVENT SHALL APPLE BE LIABLE FOR ANY SPECIAL, INDIRECT, INCIDENTAL OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) ARISING IN ANY WAY OUT OF THE USE, REPRODUCTION, MODIFICATION AND/OR DISTRIBUTION OF THE APPLE SOFTWARE, HOWEVER CAUSED AND WHETHER UNDER THEORY OF CONTRACT, TORT (INCLUDING NEGLIGENCE), STRICT LIABILITY OR OTHERWISE, EVEN IF APPLE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Copyright (C) 2009 Apple Inc. All Rights Reserved. */ #import "iTunesRSSImporter.h" #import "Song.h" #import "Category.h" #import "CategoryCache.h" #import <libxml/tree.h> // Function prototypes for SAX callbacks. This sample implements a minimal subset of SAX callbacks. // Depending on your application's needs, you might want to implement more callbacks. static void startElementSAX(void *context, const xmlChar *localname, const xmlChar *prefix, const xmlChar *URI, int nb_namespaces, const xmlChar **namespaces, int nb_attributes, int nb_defaulted, const xmlChar **attributes); static void endElementSAX(void *context, const xmlChar *localname, const xmlChar *prefix, const xmlChar *URI); static void charactersFoundSAX(void *context, const xmlChar *characters, int length); static void errorEncounteredSAX(void *context, const char *errorMessage, ...); // Forward reference. The structure is defined in full at the end of the file. static xmlSAXHandler simpleSAXHandlerStruct; // Class extension for private properties and methods. @interface iTunesRSSImporter () @property BOOL storingCharacters; @property (nonatomic, retain) NSMutableData *characterBuffer; @property BOOL done; @property BOOL parsingASong; @property NSUInteger countForCurrentBatch; @property (nonatomic, retain) Song *currentSong; @property (nonatomic, retain) NSURLConnection *rssConnection; @property (nonatomic, retain) NSDateFormatter *dateFormatter; // The autorelease pool property is assign because autorelease pools cannot be retained. @property (nonatomic, assign) NSAutoreleasePool *importPool; @end static double lookuptime = 0; @implementation iTunesRSSImporter @synthesize iTunesURL, delegate, persistentStoreCoordinator; @synthesize rssConnection, done, parsingASong, storingCharacters, currentSong, countForCurrentBatch, characterBuffer, dateFormatter, importPool; - (void)dealloc { [iTunesURL release]; [characterBuffer release]; [currentSong release]; [rssConnection release]; [dateFormatter release]; [persistentStoreCoordinator release]; [insertionContext release]; [songEntityDescription release]; [theCache release]; [super dealloc]; } - (void)main { self.importPool = [[NSAutoreleasePool alloc] init]; if (delegate && [delegate respondsToSelector:@selector(importerDidSave:)]) { [[NSNotificationCenter defaultCenter] addObserver:delegate selector:@selector(importerDidSave:) name:NSManagedObjectContextDidSaveNotification object:self.insertionContext]; } done = NO; self.dateFormatter = [[[NSDateFormatter alloc] init] autorelease]; [dateFormatter setDateStyle:NSDateFormatterLongStyle]; [dateFormatter setTimeStyle:NSDateFormatterNoStyle]; // necessary because iTunes RSS feed is not localized, so if the device region has been set to other than US // the date formatter must be set to US locale in order to parse the dates [dateFormatter setLocale:[[[NSLocale alloc] initWithLocaleIdentifier:@"US"] autorelease]]; self.characterBuffer = [NSMutableData data]; NSURLRequest *theRequest = [NSURLRequest requestWithURL:iTunesURL]; // create the connection with the request and start loading the data rssConnection = [[NSURLConnection alloc] initWithRequest:theRequest delegate:self]; // This creates a context for "push" parsing in which chunks of data that are not "well balanced" can be passed // to the context for streaming parsing. The handler structure defined above will be used for all the parsing. // The second argument, self, will be passed as user data to each of the SAX handlers. The last three arguments // are left blank to avoid creating a tree in memory. context = xmlCreatePushParserCtxt(&simpleSAXHandlerStruct, self, NULL, 0, NULL); if (rssConnection != nil) { do { [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]]; } while (!done); } // Display the total time spent finding a specific object for a relationship NSLog(@"lookup time %f", lookuptime); // Release resources used only in this thread. xmlFreeParserCtxt(context); self.characterBuffer = nil; self.dateFormatter = nil; self.rssConnection = nil; self.currentSong = nil; [theCache release]; theCache = nil; NSError *saveError = nil; NSAssert1([insertionContext save:&saveError], @"Unhandled error saving managed object context in import thread: %@", [saveError localizedDescription]); if (delegate && [delegate respondsToSelector:@selector(importerDidSave:)]) { [[NSNotificationCenter defaultCenter] removeObserver:delegate name:NSManagedObjectContextDidSaveNotification object:self.insertionContext]; } if (self.delegate != nil && [self.delegate respondsToSelector:@selector(importerDidFinishParsingData:)]) { [self.delegate importerDidFinishParsingData:self]; } [importPool release]; self.importPool = nil; } - (NSManagedObjectContext *)insertionContext { if (insertionContext == nil) { insertionContext = [[NSManagedObjectContext alloc] init]; [insertionContext setPersistentStoreCoordinator:self.persistentStoreCoordinator]; } return insertionContext; } - (void)forwardError:(NSError *)error { if (self.delegate != nil && [self.delegate respondsToSelector:@selector(importer:didFailWithError:)]) { [self.delegate importer:self didFailWithError:error]; } } - (NSEntityDescription *)songEntityDescription { if (songEntityDescription == nil) { songEntityDescription = [[NSEntityDescription entityForName:@"Song" inManagedObjectContext:self.insertionContext] retain]; } return songEntityDescription; } - (CategoryCache *)theCache { if (theCache == nil) { theCache = [[CategoryCache alloc] init]; theCache.managedObjectContext = self.insertionContext; } return theCache; } - (Song *)currentSong { if (currentSong == nil) { currentSong = [[Song alloc] initWithEntity:self.songEntityDescription insertIntoManagedObjectContext:self.insertionContext]; } return currentSong; } #pragma mark NSURLConnection Delegate methods // Forward errors to the delegate. - (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error { [self performSelectorOnMainThread:@selector(forwardError:) withObject:error waitUntilDone:NO]; // Set the condition which ends the run loop. done = YES; } // Called when a chunk of data has been downloaded. - (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data { // Process the downloaded chunk of data. xmlParseChunk(context, (const char *)[data bytes], [data length], 0); } - (void)connectionDidFinishLoading:(NSURLConnection *)connection { // Signal the context that parsing is complete by passing "1" as the last parameter. xmlParseChunk(context, NULL, 0, 1); context = NULL; // Set the condition which ends the run loop. done = YES; } #pragma mark Parsing support methods static const NSUInteger kImportBatchSize = 20; - (void)finishedCurrentSong { parsingASong = NO; self.currentSong = nil; countForCurrentBatch++; // Periodically purge the autorelease pool and save the context. The frequency of this action may need to be tuned according to the // size of the objects being parsed. The goal is to keep the autorelease pool from growing too large, but // taking this action too frequently would be wasteful and reduce performance. if (countForCurrentBatch == kImportBatchSize) { [importPool release]; self.importPool = [[NSAutoreleasePool alloc] init]; NSError *saveError = nil; NSAssert1([insertionContext save:&saveError], @"Unhandled error saving managed object context in import thread: %@", [saveError localizedDescription]); countForCurrentBatch = 0; } } /* Character data is appended to a buffer until the current element ends. */ - (void)appendCharacters:(const char *)charactersFound length:(NSInteger)length { [characterBuffer appendBytes:charactersFound length:length]; } - (NSString *)currentString { // Create a string with the character data using UTF-8 encoding. UTF-8 is the default XML data encoding. NSString *currentString = [[[NSString alloc] initWithData:characterBuffer encoding:NSUTF8StringEncoding] autorelease]; [characterBuffer setLength:0]; return currentString; } @end #pragma mark SAX Parsing Callbacks // The following constants are the XML element names and their string lengths for parsing comparison. // The lengths include the null terminator, to ensure exact matches. static const char *kName_Item = "item"; static const NSUInteger kLength_Item = 5; static const char *kName_Title = "title"; static const NSUInteger kLength_Title = 6; static const char *kName_Category = "category"; static const NSUInteger kLength_Category = 9; static const char *kName_Itms = "itms"; static const NSUInteger kLength_Itms = 5; static const char *kName_Artist = "description"; static const NSUInteger kLength_Artist = 7; static const char *kName_Album = "description"; static const NSUInteger kLength_Album = 6; static const char *kName_ReleaseDate = "releasedate"; static const NSUInteger kLength_ReleaseDate = 12; /* This callback is invoked when the importer finds the beginning of a node in the XML. For this application, out parsing needs are relatively modest - we need only match the node name. An "item" node is a record of data about a song. In that case we create a new Song object. The other nodes of interest are several of the child nodes of the Song currently being parsed. For those nodes we want to accumulate the character data in a buffer. Some of the child nodes use a namespace prefix. */ static void startElementSAX(void *parsingContext, const xmlChar *localname, const xmlChar *prefix, const xmlChar *URI, int nb_namespaces, const xmlChar **namespaces, int nb_attributes, int nb_defaulted, const xmlChar **attributes) { iTunesRSSImporter *importer = (iTunesRSSImporter *)parsingContext; // The second parameter to strncmp is the name of the element, which we known from the XML schema of the feed. // The third parameter to strncmp is the number of characters in the element name, plus 1 for the null terminator. if (prefix == NULL && !strncmp((const char *)localname, kName_Item, kLength_Item)) { importer.parsingASong = YES; } else if (importer.parsingASong && ( (prefix == NULL && (!strncmp((const char *)localname, kName_Title, kLength_Title) || !strncmp((const char *)localname, kName_Category, kLength_Category))) || ((prefix != NULL && !strncmp((const char *)prefix, kName_Itms, kLength_Itms)) && (!strncmp((const char *)localname, kName_Artist, kLength_Artist) || !strncmp((const char *)localname, kName_Album, kLength_Album) || !strncmp((const char *)localname, kName_ReleaseDate, kLength_ReleaseDate))) )) { importer.storingCharacters = YES; } } /* This callback is invoked when the parse reaches the end of a node. At that point we finish processing that node, if it is of interest to us. For "item" nodes, that means we have completed parsing a Song object. We pass the song to a method in the superclass which will eventually deliver it to the delegate. For the other nodes we care about, this means we have all the character data. The next step is to create an NSString using the buffer contents and store that with the current Song object. */ static void endElementSAX(void *parsingContext, const xmlChar *localname, const xmlChar *prefix, const xmlChar *URI) { iTunesRSSImporter *importer = (iTunesRSSImporter *)parsingContext; if (importer.parsingASong == NO) return; if (prefix == NULL) { if (!strncmp((const char *)localname, kName_Item, kLength_Item)) { [importer finishedCurrentSong]; } else if (!strncmp((const char *)localname, kName_Title, kLength_Title)) { importer.currentSong.title = importer.currentString; } else if (!strncmp((const char *)localname, kName_Category, kLength_Category)) { double before = [NSDate timeIntervalSinceReferenceDate]; Category *category = [importer.theCache categoryWithName:importer.currentString]; double delta = [NSDate timeIntervalSinceReferenceDate] - before; lookuptime += delta; importer.currentSong.category = category; } } else if (!strncmp((const char *)prefix, kName_Itms, kLength_Itms)) { if (!strncmp((const char *)localname, kName_Artist, kLength_Artist)) { NSString *string = importer.currentSong.artist; NSArray *strings = [string componentsSeparatedByString: @", "]; //importer.currentSong.artist = importer.currentString; } else if (!strncmp((const char *)localname, kName_Album, kLength_Album)) { importer.currentSong.album = importer.currentString; } else if (!strncmp((const char *)localname, kName_ReleaseDate, kLength_ReleaseDate)) { NSString *dateString = importer.currentString; importer.currentSong.releaseDate = [importer.dateFormatter dateFromString:dateString]; } } importer.storingCharacters = NO; } /* This callback is invoked when the parser encounters character data inside a node. The importer class determines how to use the character data. */ static void charactersFoundSAX(void *parsingContext, const xmlChar *characterArray, int numberOfCharacters) { iTunesRSSImporter *importer = (iTunesRSSImporter *)parsingContext; // A state variable, "storingCharacters", is set when nodes of interest begin and end. // This determines whether character data is handled or ignored. if (importer.storingCharacters == NO) return; [importer appendCharacters:(const char *)characterArray length:numberOfCharacters]; } /* A production application should include robust error handling as part of its parsing implementation. The specifics of how errors are handled depends on the application. */ static void errorEncounteredSAX(void *parsingContext, const char *errorMessage, ...) { // Handle errors as appropriate for your application. NSCAssert(NO, @"Unhandled error encountered during SAX parse."); } // The handler struct has positions for a large number of callback functions. If NULL is supplied at a given position, // that callback functionality won't be used. Refer to libxml documentation at http://www.xmlsoft.org for more information // about the SAX callbacks. static xmlSAXHandler simpleSAXHandlerStruct = { NULL, /* internalSubset */ NULL, /* isStandalone */ NULL, /* hasInternalSubset */ NULL, /* hasExternalSubset */ NULL, /* resolveEntity */ NULL, /* getEntity */ NULL, /* entityDecl */ NULL, /* notationDecl */ NULL, /* attributeDecl */ NULL, /* elementDecl */ NULL, /* unparsedEntityDecl */ NULL, /* setDocumentLocator */ NULL, /* startDocument */ NULL, /* endDocument */ NULL, /* startElement*/ NULL, /* endElement */ NULL, /* reference */ charactersFoundSAX, /* characters */ NULL, /* ignorableWhitespace */ NULL, /* processingInstruction */ NULL, /* comment */ NULL, /* warning */ errorEncounteredSAX, /* error */ NULL, /* fatalError //: unused error() get all the errors */ NULL, /* getParameterEntity */ NULL, /* cdataBlock */ NULL, /* externalSubset */ XML_SAX2_MAGIC, // NULL, startElementSAX, /* startElementNs */ endElementSAX, /* endElementNs */ NULL, /* serror */ }; Thanks.

Developer IT