Posts Tagged ‘Friends’

Using the PHP Document Object Model (DOM) to get all page links

Wednesday, January 27th, 2010

Further to the article I wrote about parsing links from an HTML page, here is a more elegant and accurate way to get every link, using the Document Object Model (DOM)

/**
 * @author Jay Gilford
 */

/**
 * get_links()
 * 
 * @param string $url
 * @return array
 */
function get_links($url) {
    
    // Create a new DOM Document to hold our webpage structure
    $xml = new DOMDocument();
    
    // Load the URL's contents into the DOM (the @ suppresses any warnings caused by invalid HTML)
    @$xml->loadHTMLFile($url);
    
    // Empty array to hold all links to return
    $links = array();
    
    //Loop through each <a> tag in the DOM and add it to the links array
    foreach($xml->getElementsByTagName('a') as $link) {
        $links[] = array('url' => $link->getAttribute('href'), 'text' => $link->nodeValue);
    }
    
    //Return the links
    return $links;
}

The comments in the code above explain how it works. To call the function, simply use
$links = get_links('http://www.example.com');
changing the URL to the page you want the links from. You could also expand this code to return further details for each link, such as its rel="nofollow" attribute
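As a rough sketch of that kind of expansion, the variant below also reports whether each link carries rel="nofollow". The function name get_links_with_rel and the sample markup are made up for illustration, and it takes an HTML string rather than a URL so it can be tried without a network request. It also uses libxml_use_internal_errors() in place of the @ operator, which is one common way to keep parser warnings quiet.

```php
<?php
/**
 * Hypothetical variant of get_links() that also reports rel="nofollow".
 * Takes an HTML string rather than a URL (an assumption for this sketch).
 */
function get_links_with_rel($html) {
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of silencing them with @
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array();
    foreach ($doc->getElementsByTagName('a') as $link) {
        // Skip anchors that have no href, e.g. <a name="top">
        if (!$link->hasAttribute('href')) {
            continue;
        }

        // rel may hold several space-separated tokens, e.g. "nofollow noopener"
        $rel = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))), -1, PREG_SPLIT_NO_EMPTY);

        $links[] = array(
            'url'      => $link->getAttribute('href'),
            'text'     => trim($link->nodeValue),
            'nofollow' => in_array('nofollow', $rel, true),
        );
    }

    return $links;
}

// Sample markup, invented for this example
$sample = '<p><a href="/about" rel="nofollow noopener">About</a> '
        . '<a name="top">Top</a> '
        . '<a href="http://www.example.com/">Home</a></p>';

$links = get_links_with_rel($sample);
print_r($links);
```

To use it on a live page you could fetch the markup first, for example with file_get_contents(), and pass the result in.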

If you have any questions about this feel free to contact me as always

Please also note that DOMDocument requires PHP 5 or later

How to get all links from a web page

Monday, October 26th, 2009

A question that gets asked all the time on forums is “How do I get all the links on a web page?”, meaning everything inside <a> tags, so here’s some code with a comment on each step

/**
 * @author Jay Gilford
 */

// regular expression pattern to match all links on a page
$pattern = '%<a[^>]+href="(?P<url>[^"]+)"[^>]*>(?P<text>[^<]+)</a>%si';

// Webpage URL to get links from
$url = 'http://www.jaygilford.com/';

// Fetch contents of whole page
$page_content = file_get_contents($url);

// Get all matches of links and put them into the $matches variable
preg_match_all($pattern, $page_content, $matches);

// Variable to hold all of our urls and their text
$urls = array();

// Loop through each array item
foreach($matches['url'] as $k=>$v) {
    // combine the url and text into its own key for ease of access
    $urls[$k] = array('url' => $v,'text' => $matches['text'][$k]);
}

// For display purposes only to show the contents of $urls
echo print_r($urls, true);
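The combining loop can be shortened by passing the PREG_SET_ORDER flag, which makes preg_match_all() return one array per match instead of one array per group. The sketch below assumes a pattern with named groups url and text, as the loop above expects, and runs it against a short inline HTML string (invented for the example) rather than a fetched page.

```php
<?php
// Pattern with the named groups "url" and "text"
$pattern = '%<a[^>]+href="(?P<url>[^"]+)"[^>]*>(?P<text>[^<]+)</a>%si';

// Inline sample HTML (an assumption for illustration) instead of a fetched page
$page_content = '<a href="/one">First</a> <a class="x" href="/two">Second</a>';

// PREG_SET_ORDER groups the results per match, so each entry
// already holds its own url and text
preg_match_all($pattern, $page_content, $matches, PREG_SET_ORDER);

$urls = array();
foreach ($matches as $match) {
    $urls[] = array('url' => $match['url'], 'text' => $match['text']);
}

print_r($urls);
```

Bear in mind that a pattern like this only matches links whose href is double-quoted and whose link text contains no nested tags.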

If you have any questions regarding this feel free to contact me. Details can be found on the about page

Creating random activation links for downloads

Monday, July 21st, 2008

This article is intended for advanced users. It explains the principles behind creating a random download activation link that stays active for 48 hours after a payment is made, for example through PayPal

You are going to need two files for this to work. The first creates the link, adds it to a database table and sends it via email. The second handles incoming links for the download script and, if the link is verified as correct, allows the user to download the file

NOTE: I will not be describing the intricacies of email or PayPal payments in this article, only the methods you will need to follow to create and verify a link

Part 1 – Creating the activation key

This should go in your script after your payment has been accepted. I have not included the MySQL connection code (the mysql_connect call and the matching close call), in case you are working with multiple databases whilst doing this

The rand_text function is explained here

function rand_text(   $min = 10,
                      $max = 20,
                      $randtext = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890' )
{
    if($min < 1) $min=1;
    $varlen = rand($min,$max);
    $randtextlen = strlen($randtext);
    $text = '';

    for($i=0; $i < $varlen; $i++)
    {
        // substr() offsets are zero-based, so pick from 0 to length - 1
        $text .= substr($randtext, rand(0, $randtextlen - 1), 1);
    }
    return $text;
}

// Some setup values
$tblname = 'tbl_keys'; #table name in database
$keymin = 50; #minimum key length
$keymax = 80; #maximum key length
$keyurl = 'http://www.domain.com/activate.php?key='; #domain to add key to
$datefield = 'added'; #date field to put insert date into
$keyfield = 'key'; #activation key field
$mailto = 'email@example.com'; #email to send link to
$mailsubject = 'Download activation link'; #email subject line

// create random string
$key = rand_text($keymin,$keymax);
// add it to database with current date
$query = "INSERT INTO
              `{$tblname}`
          SET
              `{$keyfield}` = '{$key}',
              `{$datefield}` = NOW()";
mysql_query($query);

//Add key to activation url template
$keyurl .= $key;

//Create mail message
$message = "Below is your activation link. You have 48 hours in which to use it, after which it will expire

{$keyurl}

Your website name
http://www.yourdomain.com/";

//Mail activation key to the user
mail($mailto, $mailsubject, $message);

So you now have an emailing key generator. Next you will need to make a table in your database (remember to change tbl_keys to the one assigned to $tblname above)

CREATE TABLE `tbl_keys` (
  `id` int(11) NOT NULL auto_increment,
  `added` datetime NOT NULL,
  `key` varchar(100) NOT NULL,
  PRIMARY KEY  (`id`)
)

That's a basic example just for this tutorial. For your own table you can add other information, such as the download ID for the link (i.e. what the user will download upon clicking it), plus anything else you wish to store with each link
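One thing to consider for the key itself: rand() is not designed for security-sensitive values, and an activation key is effectively a password for the download. On PHP 7 or later, random_int() draws from a cryptographically secure source instead. The function below is a sketch of how rand_text could be adapted. The name secure_rand_text is made up for this example and is not part of the original script.

```php
<?php
/**
 * Hypothetical variant of rand_text() built on random_int() (PHP 7+).
 * Same parameters as rand_text(): minimum length, maximum length, alphabet.
 */
function secure_rand_text($min = 10,
                          $max = 20,
                          $randtext = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890')
{
    if ($min < 1) $min = 1;
    if ($max < $min) $max = $min;

    $varlen = random_int($min, $max);
    $last = strlen($randtext) - 1; // highest valid zero-based index

    $text = '';
    for ($i = 0; $i < $varlen; $i++) {
        // Pick one character at a random position in the alphabet
        $text .= $randtext[random_int(0, $last)];
    }
    return $text;
}

// Same key length range as the setup values in Part 1
$key = secure_rand_text(50, 80);
echo $key, "\n";
```

It can be dropped in wherever rand_text($keymin, $keymax) is called, since the parameters are in the same order.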

Part 2 - Creating the key verification and download script

Now we need to create the activate.php script that was in the $keyurl above, to take the key and verify that the key hasn't expired

//Verify a key has been entered
if(!isset($_GET['key']) || strlen($_GET['key']) == 0)
{
	//Redirect to site homepage and stop the script
	header('Location: /index.php');
	exit;
}

// Some setup values (same as first script)
$tblname = 'tbl_keys'; #table name in database
$datefield = 'added'; #date field to put insert date into
$keyfield = 'key'; #activation key field

//Assign key to shorter variable for ease of use
$key = $_GET['key'];

//////////////////////////////////////////////////
//CONNECT HERE TO DATABASE USING mysql_connect()//
//////////////////////////////////////////////////

//Remove any nasty characters that might cause SQL Injection
//(removes any characters except A-Z, a-z and 0-9)
$key = preg_replace('/[^A-Za-z0-9]/','',$key);

//Set up query to run (The 172800 is 48 hours in seconds)
$query = "SELECT
              *
          FROM
              `{$tblname}`
          WHERE
              (unix_timestamp(NOW()) - unix_timestamp(`{$datefield}`)) < 172800
          AND
              `{$keyfield}` = '{$key}'";

//Run the query
$res = mysql_query($query);

//Check that a result was found
if(mysql_num_rows($res) < 1)
{
	//Key not found
	die('ERROR: KEY INVALID/EXPIRED');
}
else
{
	////////////////////////////////////////////////////////
	//INSERT YOUR CODE HERE FOR WHAT HAPPENS IF THE KEY IS//
	//CORRECT AND HASNT EXPIRED                           //
	////////////////////////////////////////////////////////
}

All that's left is for you to add your MySQL connection and the code to run when the link is valid
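Note that the mysql_* functions used in this article were removed in PHP 7.0, so on a current PHP version the database calls need to go through mysqli or PDO. Below is a minimal sketch of the verification step using PDO with a prepared statement, which also keeps the user-supplied key out of the SQL text. The helper name key_is_valid is hypothetical, and the demo runs against an in-memory SQLite database purely so it works without a MySQL server. It computes the 48-hour cutoff in PHP, which assumes PHP and the database agree on the timezone.

```php
<?php
/**
 * Hypothetical helper: returns true when $key exists in tbl_keys and was
 * added less than $ttl seconds ago (172800 seconds is 48 hours).
 */
function key_is_valid(PDO $db, $key, $ttl = 172800) {
    // Compute the cutoff in PHP so the SQL stays portable across drivers
    $cutoff = date('Y-m-d H:i:s', time() - $ttl);

    // Placeholders keep the user-supplied key out of the SQL text
    $stmt = $db->prepare(
        'SELECT COUNT(*) FROM `tbl_keys` WHERE `key` = :key AND `added` > :cutoff'
    );
    $stmt->execute(array(':key' => $key, ':cutoff' => $cutoff));

    return (int) $stmt->fetchColumn() > 0;
}

// Demo database: in-memory SQLite (an assumption for this sketch only;
// a real site would pass a MySQL DSN and credentials to PDO instead)
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE `tbl_keys` (
    `id` INTEGER PRIMARY KEY,
    `added` TEXT NOT NULL,
    `key` TEXT NOT NULL
)');

// One key added just now, one added three days ago
$insert = $db->prepare('INSERT INTO `tbl_keys` (`added`, `key`) VALUES (:added, :key)');
$insert->execute(array(':added' => date('Y-m-d H:i:s'), ':key' => 'freshkey'));
$insert->execute(array(':added' => date('Y-m-d H:i:s', time() - 3 * 86400), ':key' => 'oldkey'));

var_dump(key_is_valid($db, 'freshkey'));
var_dump(key_is_valid($db, 'oldkey'));
var_dump(key_is_valid($db, 'missing'));
```

In activate.php the call would take the place of the query and the mysql_num_rows check, with $_GET['key'] passed as the second argument.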

If you have any suggestions on how to improve it or any questions, just drop me a line