Configurable PHP Search Engine for MySQL Content
by Kenneth W. Richards - Mon Apr 17, 2006 12:55pm
Just recently, I completed the design and implementation of a fully configurable search engine. It indexes content stored in the database that holds all of the major content for this website. By defining a separate configuration for each database table, I can provide a search index for major content areas such as articles, blog entries, forum posts, wiki content and even auction listings.
This is based on earlier work that I did using Active Server Pages. But I did much more than just port that work over to PHP. I did a complete redesign from the ground up, because there were some limitations in the old design that I wanted to clean up. I also wanted to add a couple of new features.
One of the major improvements I decided to make was to index word phrases instead of just single words. So I have added support for two- and three-word phrases to the indexing process. Although this increases the size of the index by a factor of about 3.5, it is worth the extra storage because it gives the user much greater flexibility and accuracy when performing searches.
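To make the phrase indexing concrete, here is a minimal sketch of how one-, two- and three-word terms could be pulled out of a piece of content. It is written in Python purely for illustration; it is not the site's PHP code, and the function name index_terms and the tokenizing rule are assumptions of mine.

```python
import re

def index_terms(text, max_words=3):
    """Return every phrase of 1 to max_words consecutive words in text."""
    # Assumed tokenizer: lowercase runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    terms = []
    for size in range(1, max_words + 1):
        # Slide a window of `size` words across the content.
        for start in range(len(words) - size + 1):
            terms.append(" ".join(words[start:start + size]))
    return terms

terms = index_terms("Configurable PHP search engine")
print(terms)
```

For content of N words this yields N + (N - 1) + (N - 2) terms, or close to three terms per word, which is in the same ballpark as the growth in index size mentioned above.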
Another improvement that I wanted to make was the ability to require certain words (using the "+" modifier) or to require that a word does not exist in the content (using the "-" modifier). This will also give users more power when attempting to locate content items on the site.
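The "+" and "-" modifiers can be sketched roughly as follows. This is an illustrative Python sketch, not the actual implementation: the names parse_query and matches are hypothetical, and the rule for plain words (at least one must match when no "+" word is given) is an assumption.

```python
def parse_query(query):
    """Split a query into required (+), excluded (-) and plain words."""
    required, excluded, optional = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional

def matches(doc_words, required, excluded, optional):
    """Decide whether a document's words satisfy the parsed query."""
    words = set(doc_words)
    if any(w not in words for w in required):
        return False  # a "+" word is missing
    if any(w in words for w in excluded):
        return False  # a "-" word is present
    if required or not optional:
        return True   # all hard constraints are met
    # Assumption: with only plain words, at least one must appear.
    return any(w in words for w in optional)

required, excluded, optional = parse_query("+php -asp search")
print(matches(["php", "search", "engine"], required, excluded, optional))
```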
The configuration of each content area consists of defining the database table where the content is stored; the primary key and the fields to index; the field to use as the title; the field to use as a short summary; and a macro used to build the URL to the content page.
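As an illustration of what one such configuration might look like, here is a small Python sketch. Every table name, column name and key in it is hypothetical, and the {column} placeholder syntax for the URL macro is my assumption rather than the format the site actually uses.

```python
# Hypothetical configuration for an "articles" content area.
ARTICLES_CONFIG = {
    "table": "articles",                 # table that stores the content
    "primary_key": "article_id",         # column that identifies a row
    "index_fields": ["title", "body"],   # columns fed to the indexer
    "title_field": "title",              # shown as the result title
    "summary_field": "summary",          # shown as the short summary
    "url_macro": "/articles/view.php?id={article_id}",  # assumed syntax
}

def build_result(config, row):
    """Turn a database row into a search result using the configuration."""
    return {
        "title": row[config["title_field"]],
        "summary": row[config["summary_field"]],
        # Fill the macro placeholders from the row's column values.
        "url": config["url_macro"].format(**row),
    }

row = {"article_id": 42, "title": "Hello", "summary": "Short", "body": "..."}
result = build_result(ARTICLES_CONFIG, row)
print(result["url"])
```

Adding a new content area would then be a matter of writing another dictionary like this one rather than changing the indexing code.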
One feature planned for the future is to hold the results of a large search in a results database table. This way, if there are multiple pages in the search results, the user can page through them without incurring the costly process of regenerating the search results each time the user navigates to a new page. The table can also store common searches so that the same search results can be shared by multiple users.
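Since this feature is only planned, the following is just one way it might work, sketched in Python with an in-memory SQLite database standing in for MySQL. The table layout, the function names and the idea of keying the cache on a normalized query string are all assumptions.

```python
import sqlite3

def open_cache():
    """Create the (hypothetical) results table in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE search_results ("
        " query TEXT, position INTEGER, content_id INTEGER,"
        " PRIMARY KEY (query, position))"
    )
    return db

def normalize(query):
    """Collapse case and spacing so equivalent searches share one entry."""
    return " ".join(query.lower().split())

def get_page(db, query, page, per_page, run_search):
    """Return one page of content ids, running the search only if needed."""
    key = normalize(query)
    cached = db.execute(
        "SELECT COUNT(*) FROM search_results WHERE query = ?", (key,)
    ).fetchone()[0]
    if cached == 0:
        # Nothing stored yet: run the costly search once and keep the order.
        # (A search with no hits is re-run each time in this simple sketch.)
        rows = [(key, pos, cid) for pos, cid in enumerate(run_search(key))]
        db.executemany("INSERT INTO search_results VALUES (?, ?, ?)", rows)
    found = db.execute(
        "SELECT content_id FROM search_results WHERE query = ?"
        " ORDER BY position LIMIT ? OFFSET ?",
        (key, per_page, (page - 1) * per_page),
    ).fetchall()
    return [cid for (cid,) in found]
```

A real version would also need some way to expire stored results when the underlying content changes.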
Going forward, I will be looking to make improvements to the performance of the search engine. It is very efficient right now, but as more and more content gets created, the growing index could slow searches down. I may also make changes that would display the search results in a table format instead of Google-style search result listings.
You can test out the search engine by going to the Latest News area of the site. There you will see a search box with the label Search Articles. Please try out this new functionality and look for it to be applied to other areas of the site in short order.