One of the most important things when developing a website is making sure that people can easily find the information they need. Site maps and site search are probably the most commonly implemented features for making a site's content accessible. Whenever I build a site that is more than just a few pages, I usually create a site map that dynamically generates links to every page on the site. Then I use the script below, which reads the sitemap, crawls the whole site, and indexes the content into a Verity collection to power my search functionality.
This part especially:
=====
You will see in the code that as the script crawls each page of the site, it adds "search=Y" to the URL's query string. I set up my sites so that if URL.Search equals "Y", the pages do not display the site's header, footer, or side navigation. This way my Verity index only contains the content in the body of the page.
=====
is SO smart!
I found a similar solution for the indexing via sitemap but ended up jumping through a few hoops to strip out everything before and after the main content area of the page, using a comment in the code. Of course, without that comment I'd be out of luck on any given page. Very cool solution here.
I also appreciate the info about the lock and possible corruption. I will be sure to revisit this code when I do my next Verity-via-sitemap setup... soon!
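For readers who want to see the "search=Y" idea in code, here is a minimal sketch of a page template that hides the site chrome when the crawler asks for it. The file names header.cfm, sidenav.cfm, and footer.cfm are assumptions for illustration, not names from the original post.

```cfm
<!--- page.cfm: hypothetical page template.
      header.cfm, sidenav.cfm and footer.cfm are assumed include names. --->

<!--- Default to normal rendering when no search flag is passed --->
<cfparam name="URL.search" default="N">
<cfset isCrawler = (URL.search EQ "Y")>

<!--- Skip header and side navigation for the indexing crawler --->
<cfif NOT isCrawler>
    <cfinclude template="header.cfm">
    <cfinclude template="sidenav.cfm">
</cfif>

<!--- Main content: always rendered, so it is all the crawler receives --->
<div id="content">
    <h1>About us</h1>
    <p>Page body goes here.</p>
</div>

<!--- Skip footer for the indexing crawler --->
<cfif NOT isCrawler>
    <cfinclude template="footer.cfm">
</cfif>
```

Requesting page.cfm?search=Y should then return only the content block, so no comment markers are needed to strip the surrounding markup.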
What should the sitemap.cfm page contain?
Just a list of links to pages on the site like below?
eg: <a href=index.cfm>home</a>
<a href=index.cfm?pageid=2>about us</a>
<a href=index.cfm?pageid=3>products</a>
<a href=index.cfm?pageid=4>services</a>
<a href=index.cfm?pageid=5>contact us</a>
I look forward to your reply.
Many thanks in advance.
This script is set up to read a sitemap where all the href attributes in the links contain full URLs.
<a href='http://www.mysite.com/index.cfm'>home</a>
<a href='http://www.mysite.com/index.cfm?pageid=2'>about us</a>
<a href='http://www.mysite.com/index.cfm?pageid=3'>products</a>
<a href='http://www.mysite.com/index.cfm?pageid=4'>services</a>
<a href='http://www.mysite.com/index.cfm?pageid=5'>contact us</a>
However, in indexsite.cfm you can change the cfhttp tag that reads the sitemap and set its resolveurl attribute to "yes"; cfhttp will then change all your relative links into full URLs.
<cfhttp url="http://www.mywebsite.com/sitemap.cfm" resolveurl="Yes" method="GET"></cfhttp>
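To show how that cfhttp call could fit into the rest of the crawl, here is a rough sketch of the steps described in the post: read the sitemap, fetch each link with search=Y appended, and add the page text to a Verity collection inside a lock. It is not the original indexsite.cfm. The collection name "mysite", the lock name, the regular expressions, and the choice to store the page URL as the key are all assumptions, and the collection is assumed to exist already.

```cfm
<!--- Sketch only. Assumes a Verity collection named "mysite" already exists. --->
<cfhttp url="http://www.mywebsite.com/sitemap.cfm" resolveurl="Yes" method="GET"></cfhttp>

<!--- Collect every quoted href value from the sitemap HTML --->
<cfset links = ArrayNew(1)>
<cfset pos = 1>
<cfloop condition="true">
    <cfset m = REFindNoCase("href\s*=\s*[""']([^""']+)[""']", cfhttp.FileContent, pos, true)>
    <cfif m.pos[1] EQ 0><cfbreak></cfif>
    <cfset ArrayAppend(links, Mid(cfhttp.FileContent, m.pos[2], m.len[2]))>
    <cfset pos = m.pos[1] + m.len[1]>
</cfloop>

<cfloop from="1" to="#ArrayLen(links)#" index="i">
    <cfset pageUrl = links[i]>

    <!--- Append search=Y so the page renders without header, footer or side navigation --->
    <cfif Find("?", pageUrl)>
        <cfset crawlUrl = pageUrl & "&search=Y">
    <cfelse>
        <cfset crawlUrl = pageUrl & "?search=Y">
    </cfif>
    <cfhttp url="#crawlUrl#" method="GET"></cfhttp>

    <!--- Use the <title> text as the document title, falling back to the URL --->
    <cfset t = REFindNoCase("<title[^>]*>([^<]*)</title>", cfhttp.FileContent, 1, true)>
    <cfif t.pos[1] GT 0 AND t.len[2] GT 0>
        <cfset pageTitle = Trim(Mid(cfhttp.FileContent, t.pos[2], t.len[2]))>
    <cfelse>
        <cfset pageTitle = pageUrl>
    </cfif>

    <!--- Crude tag stripping so only readable text is indexed --->
    <cfset pageText = REReplaceNoCase(cfhttp.FileContent, "<[^>]*>", " ", "all")>

    <!--- Serialize collection updates; the lock name is an assumption --->
    <cflock name="verity_mysite" type="exclusive" timeout="30">
        <cfindex collection="mysite" action="update" type="custom"
                 key="#pageUrl#" title="#pageTitle#" body="#pageText#">
    </cflock>
</cfloop>
```

Storing the URL as the key is just one option; it means a later cfsearch can link to each result through its key column.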
Hopefully this provides me with an effective solution.
I'll post back and let you know how it works or if I have any other questions.
When testing the search, it doesn't seem to be searching the contents/body of the pages. It only returns results where the search term matches what is in the page's <title></title>.
How do I get it to search the body as well?
Also, I cannot display what is stored as "body" and "URLpath".
Sorry I am sounding like such a newbie... this is my first time using Verity, normally I use queries across multiple tables, which is fairly slow.
Thanks in advance for all your help.