bradino - Developer IT

Screen Scraping Twitter

- by BRADINO

I got an email today asking for help to scrape Twitter. In particular, to be able to login. So I am going to show everyone, NOT to encourage anyone to violate Twitters terms of use but as an educational blog post about how PHP and cURL can be used to post variables and store cookies. Again, I am using the cScrape class I wrote, which you can download. Step 1 First go to twitter.com and look at the source code of the login to get the form field names and the form post location. You will see that the form posts to https://twitter.com/session and the username and password fields are session[username_or_email] and session[password] respectively. Step 2 Now you are ready to login. So using the fetch function in the Scrape class you create an associative array to contain the form values you want to post. The other thing you will need to do is uncomment the lines for CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR. Cookies will be required to stay logged in and scrape around. The paths to the cookie files need to be writable by your app. Also you will need to uncomment the line about CURLOPT_FOLLOWLOCATION. $data = array('session[username_or_email]' => "bradino", 'session[password]' => "secret"); $scrape->fetch('https://twitter.com/sessions',$data); Step 1.5 Oops that didn't work. All I got back was 403 Forbidden: The server understood the request, but is refusing to fulfill it. Ahhh I see another variable called authenticity_token I bet Twitter was looking for that. So let's back up and first hit twitter.com to get the authenticity_token variable, and then make the login post request with that variable included in our array of parameters. $scrape->fetch('https://twitter.com'); $data = array('session[username_or_email]' => "bradino", 'session[password]' => "secret"); $data['authenticity_token'] = $scrape->fetchBetween('name="authenticity_token" type="hidden" value="','"',$scrape->result); $scrape->fetch('https://twitter.com/sessions',$data); echo $scrape->result; So that's basically it. Now you are logged in and can scrape around and request other pages as you normally would. Sorry it wasn't a longer post. I really do enjoy this kind of stuff so if anyone has a request, hit me up. Errors? 1) Make sure that you are properly parsing the token variable 2) Make sure that you uncommented the lines about CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR, those options need to be enabled and be sure the path set is writable by your application 3) Make sure that the path to the cookie file is writable and that it is getting data written to it 4) If you get a message about being redirected you need to uncomment the line about CURLOPT_FOLLOWLOCATION, that option needs to be enabled true

Read the article

URL Multiple Query Parameters Encoded with HTML Entities

- by BRADINO

I came across a situation where a URL with multiple query parameters was encoded using htmlentities() and PHP was not recognizing the query parameters using $_GET. A common case for encoding urls using htmlentities() is to use them inside XML documents. So a url with multiple query parameters, encoded using htmlentities() would look like this: http://www.bradino.com/?color=white&size=medium&quantity=3 and when that url is accessed the second and third query parameters are not recognized because instead of separating the subsequent variables with an & that character gets converted into &. I could not find a good way to resolve this, so basically I just encoded the query string back to normal using html_entity_decode() and then slammed the parameters back into the $_GET array using parse_str(). $query = html_entity_decode($_SERVER['QUERY_STRING']); parse_str($query,$_GET); There must be a better way! Anyone come across this before?

Read the article

Replace Listmenu with Textfield

- by BRADINO

Say you have a dropdown form field where you ask the user how they heard about you. You have numerous options and also an other field where if they selected other, then they enter the value in a text field. This is a cleaner alternative where if the user selects Other, then the dropdown turns into a textfield of the same name, so that your post routine, grooming, validation, writing to DB etc all works seamlessly. This example uses prototype JS library. Here is a working example: How did you hear about BRADINO? Select Google Yahoo MSN Other All you have to do is have an dropdown option for Other and then add call the function onchange: onchange="overrideListmenu('how-heard');" Here is the javascript function that uses Prototype: function overrideListmenu(field){ if ($F(field) == 'Other'){ Element.replace($(field), '<input name="'+field+'" id="'+field+'" type="text" value="">'); $(field).focus(); } } Here is the javascript function that uses jQuery: function overrideListmenu(field){ if ($('#'+field).val() == 'Other'){ $('#'+field).after('<input name="'+field+'" id="'+field+'" type="text" value="">').remove(); $('#'+field).focus(); } } listmenu replace textfield textfield for other how did you hear about us

Read the article

PHP Screen Scraping Class

- by BRADINO

After some positive feedback I have decided to continue to develop the PHP Screen Scraping class. This post will server as the permanent home for the class. Download PHP Screen Scraping Class Updates 20009-07-30 Added setHeader() function

Read the article

Zipcodes in CSV Generation

- by BRADINO

When exporting to CSV format, then opening in a spreadsheet program like Excel zipcodes that start with a zero or zeroes have the preceding zeros stripped off. Obviously it is because the spreadsheet sees that column as integers and preceding zeros in integers are useless. A quick and dirty trick to force Excel (hopefully you are using OpenOffice) to display the full zipcode, we wrap it in double quotes and put an equal sign in front of it, to force it to be a string like this: $zipcode = 00123; $data = '="' . $zipcode . '"' ; So if you are doing the straight query to CSV export, using the fputcsv function it would look something like this. Basically just overwrite the value in the row and then continue along. while ($row = mysql_fetch_assoc($query)){ $row['zipcode'] = '="'.$row['zipcode'].'"'; fputcsv($output, $row); } php csv zipcode csv number csv force string

Read the article

Count Email Address Domains

- by BRADINO

A quick tidbit I came up with today to count email addresses in a mysql database table grouping them by domain. So say for example you have a large list of subscribers and you want to see the breakdown of people who use Hotmail, Yahoo, Gmail, etc. SELECT COUNT( SUBSTRING_INDEX( `email` , '@', -1 ) ) AS `count` , SUBSTRING_INDEX( `email` , '@', -1 ) AS `domain` FROM `subscribers` WHERE `email` != '' GROUP BY `domain` ORDER BY `count` DESC This sql statement assumes that the table is called 'subscribers' and the column containing the email addresses is 'email'. Change these two values to match your table name and email address column name. mysql count email mysql count domain mysql split email mysql split domain

Read the article

CakePHP Missing Database Table Error

- by BRADINO

I am baking a new project management application at work and added a couple new tables to the database today. When I went into the console to bake the new models, they were not in the list... php /path/cake/console/cake.php bake all -app /path/app/ So I manually typed in the model name and I got a missing database table for model error. I checked and double-checked and the database table was named properly. Turns out that some files inside the /app/tmp/cache/ folder were causing Cake not to recognize that I had added new tables to my database. Once I deleted the cache files cake instantly recognized my new database tables and I was baking away! rm -Rf /path/app/tmp/cache/cake*

Read the article

Calculate Age using Date Field

- by BRADINO

So if you have a database table that has DOB borthdays as date fields, this is an easy way to query that table based on age parameters. The following examples assume that the date of birth date field is dob and the table name is people. Find people who are 30 years old SELECT DATE_FORMAT( FROM_DAYS( TO_DAYS( now( ) ) - TO_DAYS( `dob` ) ) , '%Y' ) +0 AS `age` FROM `people` HAVING `age` = 30 Find people who are 31-42 years old SELECT DATE_FORMAT( FROM_DAYS( TO_DAYS( now( ) ) - TO_DAYS( `dob` ) ) , '%Y' ) +0 AS `age` FROM `people` HAVING `age`>= 31 AND `age` <= 42 Find oldest person SELECT MAX(DATE_FORMAT( FROM_DAYS( TO_DAYS( now( ) ) - TO_DAYS( `dob` ) ) , '%Y' ) +0) AS `age` FROM `people` Find youngest person SELECT MIN(DATE_FORMAT( FROM_DAYS( TO_DAYS( now( ) ) - TO_DAYS( `dob` ) ) , '%Y' ) +0) AS `age` FROM `people`

Read the article

MySQL Query Cache

- by BRADINO

According to MySQL.com the query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client. Purely hypothetical example: SELECT `name` FROM `beers` WHERE `favorite` = true To force the database NOT to give you a cached result simply add SQL_NO_CACHE to the query: SELECT SQL_NO_CACHE `name` FROM `beers` WHERE `favorite` = true mysql query cache sql no cache mysql nocache

Read the article

Pass Extra Parameters to JavaScript Callback Function

- by BRADINO

Here is a simple example of a function that takes a callback function as a parameter. query.send(handleQueryResponse); function handleQueryResponse(response){ alert('Processing...'); } If you wanted to pass extra variables to the callback function, you can do it like this. var param1 = 'something'; var param2 ='something else'; query.send(function(response) { handleQueryResponse(response, param1, param2) }); function handleQueryResponse(response,param1,param2){ alert('Processing...'); alert(param1); alert(param2); }

Developer IT