Mysql Full Text Searching With Php

  • October 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Mysql Full Text Searching With Php as PDF for free.

More details

  • Words: 4,433
  • Pages: 17
MySQL Full−Text Searching with PHP Copyright Notice © 2002 − 2005 − The Web Freaks, INC, PHP Freaks.com All rights reserved. No parts of this work may be reproduced in any form or by any means − graphic, electronic, or mechanical, including photocopying, recording, taping, or information storage and retrieval systems − without the written permission of the publisher. Products that are referred to in this document may be either trademarks and/or registered trademarks of the respective owners. The publisher and the author make no claim to these trademarks. While every precaution has been taken in the preparation of this document, the publisher and the author assume no responsibility for errors or omissions, or for damages resulting from the use of information contained in this document or from the use of programs and source code that may accompany it. In no event shall the publisher and the author be liable for any loss of profit or any other commercial damage caused or alleged to have been caused directly or indirectly by this document. Last Update: Sat, 26 Mar 2005 17:38:28 −0500

PHP Help: MySQL Full−Text Searching with PHP

Table of Contents MySQL Full−Text Searching with PHP...........................................................................................................1 Introduction..............................................................................................................................................1 What is Full−Text Searching?.................................................................................................................1 MySQL Full−Text Search Modes............................................................................................................1 Normal Full−Text Search Mode.......................................................................................................2 Boolean Full−Text Search Mode......................................................................................................2 Full−Text with Query Expansion Mode............................................................................................3 Full−Text Noise Words...........................................................................................................................3 Optional Tweaking of Full−Text Results................................................................................................4 Creating the Database and Structure........................................................................................................5 Creating the Full−Text Index...................................................................................................................6 The PHP Full−Text Search Engine..........................................................................................................6 The PHP Full−Text Search Engine − Breakdown...................................................................................8 searchForm() function..............................................................................................................................8 The PHP Full−Text Search Engine − Breakdown...................................................................................9 The Full−Text Queries...........................................................................................................................11 The Rest of the PHP Full−Text Search Engine Script...........................................................................13 Summary................................................................................................................................................13

i

MySQL Full−Text Searching with PHP Navigate: PHP Tutorials > PHP > Searching with PHP

Author: phpfreak Date: 03/26/2005 Version 1.0 Experience Level: Advanced

Introduction When you start designing a really cool site and you desire to have search functionality, you should seriously consider using the MySQL Full−Text feature. The advantages of Full−Text searching outweigh the standard search querying by selecting fields from tables using the LIKE clause. For example: MySQL Example: SELECT field_1, field2, field3 FROM mytable WHERE field1 LIKE '%my_seach_string%'

If you have ever created a search page using similar methods to the previous search string, you will know that the search results can swing from accurate to less accurate very easily. It's very difficult to organize the results by any type of relevance, and that's why we're discussing Full−Text searching right now! Using the MySQL Full−Text searching, you can organize the results by relevance on the query automatically.

What is Full−Text Searching? Full−Text is a type of index created on a table. It was originally designed for larger tables with lots of data. The Full−Text index basically creates a huge index of all of the words in your table. Each entry in this index references a row in your table. These indexes are created in separate index files by MySQL. When you perform a search against the Full−Text index, the MySQL Full−Text search functions use this index to return the related rows from the table back through it's result set.

MySQL Full−Text Search Modes Currently, MySQL supports three modes of Full−Text Searching. Let's review them: MySQL Full−Text Searching with PHP

1

PHP Help: MySQL Full−Text Searching with PHP

Normal Full−Text Search Mode This mode is known as Full−Text Search. I just call it normal to help distinguish between the modes :). This mode simply takes the user input and performs a Full−Text search against it. Nothing too fancy compared to to the other modes.

Boolean Full−Text Search Mode This mode is the most interesting. It basically allows the user to alter the way the search words are handled. For example, if I wanted to search for "php mysql" and I wanted the word "mysql" to be weighed more, I could use the + operator to enhance the results for that particular word. If I wanted to search all results for PHP but exclude all of the ones that had the word "mysql" in them, I would input "php −mysql" and using the minus operator, any result with that word will be gone! There are quite a few operators for use with Boolean Full−Text mode, so let's review those: Booeal Full−Text Search Mode Operators (From the MySQL Manual) By default (when neither + nor − is specified) the word is optional, but the rows that contain it are None rated higher. This mimics the behavior of MATCH() ... AGAINST() without the IN BOOLEAN MODE modifier. +

A leading plus sign indicates that this word must be present in every row returned.



A leading minus sign indicates that this word must not be present in any row returned.

><

These two operators are used to change a word's contribution to the relevance value that is assigned to a row. The > operator increases the contribution and the < operator decreases it. See the example below.

()

Parentheses are used to group words into subexpressions. Parenthesized groups can be nested.

~

A leading tilde acts as a negation operator, causing the word's contribution to the row relevance to be negative. It's useful for marking noise words. A row that contains such a word is rated lower than others, but is not excluded altogether, as it would be with the − operator.

"

A phrase that is enclosed within double quote ('"') characters matches only rows that contain the phrase literally, as it was typed. The full−text engine splits the phrase into words, performs a search in the FULLTEXT index for the words. Before MySQL 5.0.3, the engine then performs a substring search for the phrase in the records that are found, so the match must include non−word characters in the phrase. As of MySQL 5.0.3, non−word characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase" as of MySQL 5.0.3, but not before.

Booelan Full−Text Search Examples (From the MySQL Manual) Normal Full−Text Search Mode

2

PHP Help: MySQL Full−Text Searching with PHP apple banana

Find rows that contain at least one of the two words.

+apple +juice

Find rows that contain both words.

+apple macintosh

Find rows that contain the word ``apple'', but rank rows higher if they also contain ``macintosh''.

+apple −macintosh

Find rows that contain the word ``apple'' but not ``macintosh''.

+apple +(>turnover <strudel)

Find rows that contain the words ``apple'' and ``turnover'', or ``apple'' and ``strudel'' (in any order), but rank ``apple turnover'' higher than ``apple strudel''.

apple*

Find rows that contain words such as ``apple'', ``apples'', ``applesauce'', or ``applet''.

"some words"

Find rows that contain the exact phrase ``some words'' (for example, rows that contain ``some words of wisdom'' but not ``some noise words''). Note that the '"' characters that surround the phrase are operator characters that delimit the phrase. They are not the quotes that surround the search string itself.

Full−Text with Query Expansion Mode This mode is also interesting but it requires you to have at least MySQL 4.1 installed. Basically this mode allows the query to automatically be expanded based on relevant results. Generally, this will be used if the query is too short. Here's what the MySQL Manual says about Full−Text with Query Expansion Searches: " As of MySQL 4.1.1, full−text search supports query expansion (in particular, its variant ``blind query expansion''). This is generally useful when a search phrase is too short, which often means that the user is relying on implied knowledge that the full−text search engine usually lacks. For example, a user searching for ``database'' may really mean that ``MySQL'', ``Oracle'', ``DB2'', and ``RDBMS'' all are phrases that should match ``databases'' and should be returned, too. This is implied knowledge." " Blind query expansion (also known as automatic relevance feedback) is enabled by adding WITH QUERY EXPANSION following the search phrase. It works by performing the search twice, where the search phrase for the second search is the original search phrase concatenated with the few top found documents from the first search. Thus, if one of these documents contains the word ``databases'' and the word ``MySQL'', the second search finds the documents that contain the word ``MySQL'' even if they do not contain the word ``database''." Since this feature is very new, we're going to skip over it in this tutorial, however you will have an understanding of how to add WITH QUERY EXPANSION to your queries when you are done reading. Let's move on to understanding Noise Words and tweaking the configuration on the next page.

Full−Text Noise Words Noise words are basically common words that are found in the language. For example: and, if, or, the. MySQL Full−Text has an automatic noise word list built in and words such as "and, the, or, if" and a whole Boolean Full−Text Search Mode

3

PHP Help: MySQL Full−Text Searching with PHP lot more are automatically excluded by default as noise words. The default noiseword list can be changed by creating a file with a list of words, one word per line and change the "ft_stopword_file" setting in the MySQL configuration: Code Example: [mysqld] ft_stopword_file = /path/to/noisewords.txt

The 50% Threshold To enhance the standard Full−Text search results, any word found in your index more than 50% of the time falls into the "50% threshold" and they are treated as noisewords and excluded. This threshold is designed to enhnace the results. However, when you are searching in BOOLEAN mode, this rule is disregarded.

Optional Tweaking of Full−Text Results Of course MySQL includes methods for tweaking your search results. The tweaks will require you to have system level access to make changes to the MySQL configuration file. Maximum and Minimum Search String Lengths The MySQL configuration directive that defines minimal limit to the number of characters a search phrase can use is called "ft_min_word_len" . By default, the value is set at 4 characters. The maximum search string length setting is called "ft_max_word_len" and it is set at 84 characters by default. To change these settings, you want to edit your MySQL configuration file, usually named 'my.cnf'. For Windows users, this file usually in c:/my.cnf and Linux Users /etc/my.cnf. If the file does not exist, you can create it and then restart MySQL after the changes have been made. Here's an example of these two settings being changed: Code Example: [msyqld] ft_max_word_len = 150 ft_min_word_len = 3

Note: MySQL does not recommend you tweak the FULL−Text search settings because it will alter how their algorithms work. However, if you do change any of these settings, you need to delete and recreate or rebuild the index because each index is created based on these settings. If you do not rebuild / recreate the index, then the search results may fail. Let's move along to creating the database and table structure for our Full−Text Search.

Full−Text Noise Words

4

PHP Help: MySQL Full−Text Searching with PHP

Creating the Database and Structure For this example, we're going to create a database, grant permissions to a user, import the structure and then import 5 rows of random data that I've collected for this example. Let's create the table: MySQL Example: CREATE DATABASE `phpfreaks_search`;

Now let's grant permissions to a user named 'phpfreaks' with a password 'phpfreaks': MySQL Example: GRANT ALL PRIVILEGES ON phpfreaks_search.* TO 'phpfreaks'@'localhost' IDENTIFIED BY 'phpfreaks';

Next, let's import the structure: MySQL Example: CREATE TABLE `mytable` ( `mytable_id` int(25) NOT NULL auto_increment, `mytable_title` varchar(100) NOT NULL default '', `mytable_dts` varchar(25) NOT NULL default '', `mytable_caption` tinytext NOT NULL, `mytable_full_body` text NOT NULL, PRIMARY KEY (`mytable_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 COMMENT='Testing Full Text Search Functionality.';

Let's take a moment and talk about this structure: • mytable_id − Auto increment integer. This is our identifier index. I always use something similar to this to index my tables for reference. • mytable_title − VARCHAR field with up to 100 characters. Used for the title of the article • mytable_dts − This is my Date Timestamp field (dts). It's a VARCHAR 25 field and I always use PHP's time() function to insert the timestamp. • mytable_catpion − TINYTEXT (up to 255 characters) for the "caption". • mytable_full_body − TEXT field up to 65,535 characters which is usually good for most web content!

Ok, now that you understand the structure, you can import the data. I have chosen not to display the content of the SQL data into this tutorial. If you want, you can download my SQL file here.

Creating the Database and Structure

5

PHP Help: MySQL Full−Text Searching with PHP Download SQL File

Creating the Full−Text Index The Full−Text index can be created on CHAR, VARCHAR and any type of TEXT fields. Therefore, our table is suitable for this operation. This portion takes some consideration for the type of searching we want. You have to ask yourself a few questions. Mainly, WHICH FIELDS do you want to search? Based on that, you create an index on those fields together. For example, in our table, the only relevant fields would be mytable_title, mytable_caption and mytable_full_body. Therefore, we create an index on those items combined like this: MySQL Example: CREATE FULLTEXT INDEX full_index ON mytable( mytable_title , mytable_caption , mytable_full_body );

Now you have a Full−Text index using those three fields named "full_index". Now we can start building our PHP search engine!

The PHP Full−Text Search Engine Ahh! Now the moment we've been waiting for! The search engine script! Before you snag this and run, please read the rest of the tutorial. It is important for you to understand how this works. PHP Example: Search
'
'; ''; 'Search for: '; 'Mode: '; '<select name="mode">'; '';

Creating the Full−Text Index

6

PHP Help: MySQL Full−Text Searching with PHP echo echo echo echo

''; ' '; ''; '
';

}

// Create the navigation switch $cmd = (isset($_GET['cmd']) ? $_GET['cmd'] : ''); switch($cmd) { default: echo '

Search Database!

'; searchForm(); break;

case "search": searchForm(); echo '

Search Results:


'; $searchstring = mysql_escape_string($_GET['words']); switch($_GET['mode']) { case "normal": $sql = "SELECT mytable_id, mytable_title, mytable_caption, mytable_dts, MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring') AS score FROM mytable WHERE MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring') ORDER BY score DESC"; break; case "boolean": $sql = "SELECT mytable_id, mytable_title, mytable_caption, mytable_dts, MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring' IN BOOLEAN MODE) AS score FROM mytable WHERE MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring' IN BOOLEAN MODE) ORDER BY score DESC"; break; } // echo $sql; $result = mysql_query($sql) or die (mysql_error()); while($row = mysql_fetch_object($result)) { echo '<strong>Title: '.stripslashes(htmlspecialchars($row−>mytable_title)).'
'; echo 'Score:'. number_format($row−>score, 1).' Date: '.date('m/d/y', $row−>mytable_dts).'
'; echo '

'.stripslashes(htmlspecialchars($row−>mytable_caption)).'

'; echo '
'; } break; } ?>

Let's break this down on the next page.

The PHP Full−Text Search Engine

7

PHP Help: MySQL Full−Text Searching with PHP

The PHP Full−Text Search Engine − Breakdown Most of this script is standard PHP structuring and formatting. However, I'm going to explain it anyways so you can understand my logic. First, we structure the HTML tags. I am assuming that later on you will wrap this search engine with your own header / footer, so there's no need to explain that right now. PHP Example: Search

Next, we will create the MySQL connection to the database we created earlier in this tutorial. You should already know how to connect to MySQL and you may use your own database connection classes, so I will skip over the details on this one. PHP Example:
searchForm() function Let's discuss this a little in depth. It's just an HTML form that I put into a function so that I can make it portable and place it anywhere on the different pages of this script. PHP Example: // Create the search function: function searchForm() { // Re−usable form // variable setup for the form. $searchwords = (isset($_GET['words']) ? htmlspecialchars(stripslashes($_REQUEST['words'])) : ''); $normal = (($_GET['mode'] == 'normal') ? ' selected="selected"' : '' ); $boolean = (($_GET['mode'] == 'boolean') ? ' selected="selected"' : '' ); echo echo echo echo echo echo echo echo echo echo

'
'; ''; 'Search for: '; 'Mode: '; '<select name="mode">'; ''; ''; ' '; ''; '
';

The PHP Full−Text Search Engine − Breakdown

8

PHP Help: MySQL Full−Text Searching with PHP }

Timeout for Tenerary Operators You may notice my usage of tenerary operators to define my variables. The one thing I hate the most is undefined variables in PHP code, so this is a one line piece of code to define your variables with one value or a default value. Tenerary operators work like this: QUESTION : ANSWER ? NO ANSWER For example, in these examples, let's talk about the $normal variable that I set which will echo selected="selected" in the select menu if the user chose Normal mode in the form. PHP Example: $normal = (($_GET['mode'] == 'normal') ? ' selected="selected"' : '' );

Basically, I am checking to see if the $_GET['mode'] is equal to 'normal' and then I define $normal to 'selected="selected" if the $_GET['mode'] is equal to "normal", otherwise I set $normal to empty. There's three more tenerary operators in this code, so try to grasp these because they can help you tremendously in your programming. Let's move on. In this form, I've defined the form method as GET. I do this for a reason specifically on this form. Normally, I would use POST methods, but I believe that people like to copy a link to search results and post them on forums or instant message them to people. If you use a POST method, the querystring is hidden and the user can't copy their search result link to another place. When you use GET, the form elements are generated into a querystring and visible in the users' URL window on their browser. Just a tip that I wanted to point out. Now, this form is nothing special, except that it recognizes the user's input and when you display the form again, they will be able to alter their last input easily. This is achieved through the tenerary operators and placing the correct values in the form. Another little thing I always try to do when creating forms. This form has three basic inputs • cmd − a hidden field that will direct the user to the correct case in my switch structure later in this code. • words − The text field that the user enters the search phrase in. • mode − The type of mode − Standard or Boolean. Now that you understand the form, let's move on to the switch structure.

The PHP Full−Text Search Engine − Breakdown Let's break down the switch structure that is used for navigation in this script. Using a switch in this manner allows you to use one script for multiple actions accessed by the querystring of the URL. The variable we use searchForm() function

9

PHP Help: MySQL Full−Text Searching with PHP to navigate off of in this case is "cmd". So, if $cmd is empty, the default case is called, othewise we go to whatever case is equal to the value of $cmd. PHP Example: // Create the navigation switch $cmd = (isset($_GET['cmd']) ? $_GET['cmd'] : ''); switch($cmd) {

The default Case The default case is nothing more than the search form. You can make this look however you want and it is displayed if the $cmd variable is not set. So, if the user accesses http://yoursite.com/search.php they'll be bounced here first. In our default case, we simply call the searchForm() function and display the search form. PHP Example: default: echo '

Search Database!

'; searchForm(); break;

The search Case: Ok, this is what we all came here for! Let's discuss what's happening here. When the user submits the form, they are submitting to this case, and only this code is executed. Now, the first thing I want to mention is the usage of mysql_escape_string() . Be sure that you use this function when practical to avoid the user entering malicious input to try and break your MySQL. This code basically escapes anything that can break outside of the original query's intentions and execute other commands on the server. By escaping this code, it puts a backslash in front of the escape characters and prevents them from doing anything stupid. PHP Example: case "search": searchForm(); echo '

Search Results:


'; $searchstring = mysql_escape_string($_GET['words']);

Next, we have a sub−switch in here for the mode (standard or Boolean) so that the user can define which search they want to peform. The normal search just takes the users' input without any Boolean operators and searches against it in normal Full−Text mode. The Boolean mode accepts the boolean operators and then executes the query in Boolean mode. We talked about normal and boolean mode searches earlier in this tutorial, so you should have an understanding of what they are doing. PHP Example: The PHP Full−Text Search Engine − Breakdown

10

PHP Help: MySQL Full−Text Searching with PHP switch($_GET['mode']) { case "normal": $sql = "SELECT mytable_id, mytable_title, mytable_caption, mytable_dts, MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring') AS score FROM mytable WHERE MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring') ORDER BY score DESC"; break; case "boolean": $sql = "SELECT mytable_id, mytable_title, mytable_caption, mytable_dts, MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring' IN BOOLEAN MODE) AS score FROM mytable WHERE MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring' IN BOOLEAN MODE) ORDER BY score DESC"; break; }

The Full−Text Queries There are two types of Full−Text queries that we are peforming on this script. Each of them have a scoring routine built in so that we can sort our queries by relevance and score. I would like to point out that the score has nothing to do on a 100% scale, so don't even bother trying to sort it out like that. The score simply represents the number of hits inside the query that your term was found. Naturally, the more hits, the more relevance and the higher the score. Normal Mode Query PHP Example: case "normal": $sql = "SELECT mytable_id, mytable_title, mytable_caption, mytable_dts, MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring') AS score FROM mytable WHERE MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring') ORDER BY score DESC"; break;

Ok let's talk about the query. In my form, I posted the word "mysql php" and the query that was executed in NORMAL mode was this: MySQL Example:

SELECT mytable_id, mytable_title, mytable_caption, mytable_dts, MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('mysql php') AS score FROM mytable WHERE MATCH(myt php') ORDER BY score DESC

Let's break this query down: MySQL Example: SELECT mytable_id, mytable_title, mytable_caption,

The Full−Text Queries

11

PHP Help: MySQL Full−Text Searching with PHP

This is simply the fields I wanted to select not including the score field. MySQL Example: MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('mysql php') AS score

Now, this portion is still part of the "what" I want to retrieve and it is the score. Notice the MATCH and AGAINST functions that are now available with the Full−Text search. So, in addition to the mytable_id, mytable_title, and mytable_caption fields, I now have 'score' which is the number of hits for this searchterm in the result. Here is where the magic comes in! You will now use the MATCH() and AGAINST() functions available in MysQL's FULL−Text searching to find the results of the Full−Text search: MySQL Example: WHERE MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('mysql php') ORDER BY score DESC

In the previous example, we used the Full−Text functions and sorted it by the score returned from the query. It's magic! That was easy wasn't it? Boolean Mode Query PHP Example: case "boolean": $sql = "SELECT mytable_id, mytable_title, mytable_caption, mytable_dts, MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring' IN BOOLEAN MODE) AS score FROM mytable WHERE MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('$searchstring' IN BOOLEAN MODE) ORDER BY score DESC"; break;

The Boolean mode query is a bit more interesting. It basically allows you to increase or decrease the value of a word by using special symbols or operators. In my form, I posted the word "mysql −php" and the query that was executed in Boolean mode was this: MySQL Example: SELECT mytable_id, mytable_title, mytable_caption, mytable_dts, MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('mysql −php' IN BOOLEAN MODE) AS score FROM mytable WHERE MATCH(mytable_title, mytable_caption, mytable_full_body) AGAINST ('mysql −php' IN BOOLEAN MODE) ORDER BY score DESC

The Full−Text Queries

12

PHP Help: MySQL Full−Text Searching with PHP

The only difference between this query and the Normal mode is the use of IN BOOLEAN MODE in the AGAINST() function. By enabling IN BOOLEAN MODE we enable MySQL's Full−Text to weigh the results based on the user's preference. If you have any questions about the Boolean mode, please read from the beginning of this tutorial and you will have an understanding of how they work.

The Rest of the PHP Full−Text Search Engine Script PHP Example: // echo $sql; $result = mysql_query($sql) or die (mysql_error()); while($row = mysql_fetch_object($result)) { echo '<strong>Title: '.stripslashes(htmlspecialchars($row−>mytable_title)).'
'; echo 'Score:'. number_format($row−>score, 1).' Date: '.date('m/d/y', $row−>mytable_dts).'
'; echo '

'.stripslashes(htmlspecialchars($row−>mytable_caption)).'

'; echo '
'; } break; } ?>

The rest of the script simply executes the query and then loops through the results using standard MySQL query and result handling. If you're reading this tutorial, I assume that you know how to handle MySQL data querying and displaying results, so I don't need to discuss that here. After the querying is completed, the switch control structure is closed out and so is the HTML.

Summary By reading this tutorial, hopefully you have taken from it a good understanding of MySQL search results and possibly some other PHP / MySQL tips such as table structure design, switch control structures and the beloved tenerary operators. If you have any questions, please post them in the discussion section below this tutorial. Download Source Code Good luck! −phpfreak

The Rest of the PHP Full−Text Search Engine Script

13

PHP Help: MySQL Full−Text Searching with PHP

© Copyright 2002 − 2005 The Web Freaks, INC.

Summary

14

Related Documents